DefaultRecordSeparatorPolicy and Unterminated Double Quotes

Last winter I got the opportunity to lead the development of a batch application that would run on our midrange servers. It was also a chance to explore using Spring Batch to load large flat files from clients into our system.  The exposure would be small since we were really only loading a few files.  The process had been running great until last week.

Last week we got a file that caused an OutOfMemoryError when it was processed.  Looking into the issue, I noticed a double quote in the name field of one record.  Once I removed that single double quote, the process successfully loaded the data in a test environment.  The file contained double quotes in other places, though, so I didn't understand why this one caused the load to fail.

After attaching the source for Spring Batch I was able to walk through the code to see what was going on.  Eventually I got into my FlatFileItemReader and found it consulting a RecordSeparatorPolicy to decide where a record ends.  The DefaultRecordSeparatorPolicy treats an unterminated double quote as a sign that the record continues on the next line, so after a stray quote the reader keeps appending lines to the same record and may never find its end.  That would explain both the OutOfMemoryError and why the other double quotes in the file were harmless: presumably they came in matched pairs.  Since this is the default, it really is a problem if your client sends you a miscellaneous double quote.
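To make that behavior concrete, here is a minimal, self-contained sketch of the idea.  It is not Spring Batch's code: QuoteParityDemo, its methods and the sample data are all made up for illustration, and it assumes the check amounts to counting double quotes and treating an odd count as an open quote.  Unlike a real reader, it simply returns whatever is left over at the end of the input as one final record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a simplified imitation of quote-aware record splitting.
class QuoteParityDemo {

    // True when the text holds an odd number of double quotes,
    // i.e. a quoted field appears to have been opened but never closed.
    static boolean hasUnterminatedQuote(String text) {
        int quotes = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 != 0;
    }

    // Quote-aware reading: keep appending lines while a quote looks open.
    static List<String> readQuoteAware(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            current.append(line);
            if (!hasUnterminatedQuote(current.toString())) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        // Simplification: hand back the unfinished remainder as a record.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    // Line-oriented reading: every line is its own record.
    static List<String> readLineByLine(List<String> lines) {
        return new ArrayList<>(lines);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1,\"Smith, John\",NY",   // balanced quotes
                "2,Jane \"JJ Doe,CA",     // one stray quote
                "3,Bob,TX",
                "4,Al,WA");
        System.out.println("quote-aware records: " + readQuoteAware(lines).size());
        System.out.println("line-by-line records: " + readLineByLine(lines).size());
    }
}
```

With this sample the quote-aware pass yields two records, the second one swallowing lines 2 through 4, while the line-by-line pass yields four.  Scale the swallowed tail up to the rest of a large client file and the memory problem follows.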

The solution was to set a different RecordSeparatorPolicy on my FlatFileItemReader. Thankfully Spring Batch offers another class called SimpleRecordSeparatorPolicy, which treats every line as a complete record and doesn't care about unbalanced quotes.  After making this change in my code I was able to load the original file in my test environment with no issues.  I'm wondering if this has been noticed by others using Spring Batch.  I think this really makes a case for testing corrupt files in the QA phase just to see what would happen.
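For anyone wanting to try the same change, the wiring might look roughly like the sketch below.  It is not my production code: the class name, method name and path parameter are placeholders, PassThroughLineMapper stands in for whatever line mapper your job really uses, and it assumes a Spring Batch version where FlatFileItemReader is generic (2.x or later) with Spring Batch on the classpath.

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.batch.item.file.separator.SimpleRecordSeparatorPolicy;
import org.springframework.core.io.FileSystemResource;

// Hypothetical configuration helper; names are placeholders.
class ClientFileReaderConfig {

    static FlatFileItemReader<String> clientFileReader(String path) {
        FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
        reader.setResource(new FileSystemResource(path));
        // Placeholder mapper that returns each record as a raw String.
        reader.setLineMapper(new PassThroughLineMapper());
        // Treat each physical line as one record, regardless of quotes.
        reader.setRecordSeparatorPolicy(new SimpleRecordSeparatorPolicy());
        return reader;
    }
}
```

The trade-off is that a quoted field that legitimately spans several lines will no longer be read as one record, so this only suits files where every record sits on a single line.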

 
