I need to extract orders from a PDF file. I have converted the PDF to text, but I am having trouble understanding regular expressions. Could someone give me a small example of how to build an expression that matches a block of text spread across several lines, like this one?
ORDER NUMBER : SO773175 Ship Date: 23-Nov-15

  Style Desc : CURTAINS CR 46X54
  Linecode : J855566
  Qty 36
It doesn't matter whether I save just the values after each colon or the whole block. The block is repeated once per order, so there could be 5 or 50 of them in one file, but each order's block appears only once in the file.
I suspect you're having problems with the multiple lines. \n matches a newline in a regex, so if you are using an engine that supports Perl-like regular expressions (most of them do), then this should work:
ORDER NUMBER :\s+[^\s]+\s+Ship Date:\s+[^\s]+\n\n\s+Style Desc : .+\n\s+Linecode : .+\n\s+Qty\s+.+
I would recommend https://regex101.com/, or any of the other regex testing sites out there, as a good place to experiment with building regular expressions.
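If it helps, here is a rough sketch of how a pattern like this could be applied in code to pull out every order in the file. It uses Python's re module, which is my choice and not something from the question. The names ORDER_RE and extract_orders are made up for illustration, the second order in the sample is invented test data, and the line layout of the sample is a guess. The pattern uses \s+ between fields so it should match whether the fields are on separate lines or all on one line.

```python
import re

# Sample input. The line layout is an assumption, and the second
# order is invented purely to show that repeated blocks are found.
SAMPLE = """\
ORDER NUMBER : SO773175 Ship Date: 23-Nov-15

  Style Desc : CURTAINS CR 46X54
  Linecode : J855566
  Qty 36
ORDER NUMBER : SO000001 Ship Date: 01-Jan-16

  Style Desc : EXAMPLE ITEM
  Linecode : X000001
  Qty 5
"""

# Named groups capture the value of each field. \s+ matches any run of
# whitespace, including newlines, so the fields may sit on one line or several.
ORDER_RE = re.compile(
    r"ORDER NUMBER\s*:\s*(?P<order>\S+)\s+"
    r"Ship Date\s*:\s*(?P<ship_date>\S+)\s+"
    r"Style Desc\s*:\s*(?P<style>.+?)\s+"
    r"Linecode\s*:\s*(?P<linecode>\S+)\s+"
    r"Qty\s*:?\s*(?P<qty>\d+)"
)


def extract_orders(text):
    """Return one dict per order block found in text."""
    return [m.groupdict() for m in ORDER_RE.finditer(text)]


if __name__ == "__main__":
    for order in extract_orders(SAMPLE):
        print(order)
```

finditer walks through the whole text, so it does not matter whether the file holds 5 orders or 50. If the real file has a different layout (extra fields, a colon after Qty, and so on), the pattern will need adjusting, and a regex tester is a good place to try that.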