I needed to get credit card statement amounts into a spreadsheet from pdf statements. After copying the statements to Textpad - delete the non-money lines. My statement lines look like this:
12/12 12/12 X46WSAQ2 FILENES BOSTON MA 1,923.71
Turn Posix format on. This will insert $ in front of all amounts at the end of lines
([-]*[0-9]*[0-9]*[0-9,][0-9]*[0-9]*[0-9]*[\.][0-9][0-9])$
$\1
This will accommodate amounts up to 99,999.99 as well as credits without losing the minus sign. Once that is done - delete the first column of dates, append ';' after remaining column of dates and precede all '$' with ';'. Then drop the statement lines into a spreadsheet and use the convert text to columns delimited by ';'
What is moved is as follows
12/12; X46WSAQ2 FILENES BOSTON MA ;$1,923.71
To transfer pdf credit card statements to spreadsheet
Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
You can do all that in one go:
Note that * is the repetition operator. So
[0-9]*[0-9]*
is exactly equivalent to
[0-9]*
(except that it's slower) and matches any (possibly empty) sequence of digits.
The optionality operator is ?.
This handles amounts of any size.Find what: ^[0-9]{1,2}/[0-9]{1,2} ([0-9]{1,2}/[0-9]{1,2})(.* )(-?[0-9]{1,3}(,[0-9]{3})*\.[0-9]{2})$
Replace with: \1;\2;$\3
[X] Regular expression
Replace All
Note that * is the repetition operator. So
[0-9]*[0-9]*
is exactly equivalent to
[0-9]*
(except that it's slower) and matches any (possibly empty) sequence of digits.
The optionality operator is ?.
Thank you for the tip. I am fairly inexperienced doing this so when I figured it out I was quite pleased with myself.
Since this was posted I decided I did not want to carry forward the coded account item info - the 8 characters of info that appears after the 2nd date - so I changed to block mode to delete columns of these - however I am sure this can be done programmatically, if you have a suggestion. The catch is that if the date is Jan-Sep then the info shifts as below
2/12 2/12 EOFMPE94 FILENES BOSTON MA 1,923.71
12/12 12/12 X46WSAQ2 FILENES BOSTON MA 723.71
2/01 2/01 EOYYYE94 FILENES BOSTON MA -16.71
So the result should be
2/12; FILENES BOSTON MA ;$1,923.71
12/12; FILENES BOSTON MA ;$723.71
2/01; FILENES BOSTON MA ;$-16.71
I am not sure how to do this.
Since this was posted I decided I did not want to carry forward the coded account item info - the 8 characters of info that appears after the 2nd date - so I changed to block mode to delete columns of these - however I am sure this can be done programmatically, if you have a suggestion. The catch is that if the date is Jan-Sep then the info shifts as below
2/12 2/12 EOFMPE94 FILENES BOSTON MA 1,923.71
12/12 12/12 X46WSAQ2 FILENES BOSTON MA 723.71
2/01 2/01 EOYYYE94 FILENES BOSTON MA -16.71
So the result should be
2/12; FILENES BOSTON MA ;$1,923.71
12/12; FILENES BOSTON MA ;$723.71
2/01; FILENES BOSTON MA ;$-16.71
I am not sure how to do this.
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
You're welcome. Well done! It's good to see people trying to find a solution themselves before asking for help.
The regex [0-9]{1,2} matches one or two digits.
Try
The regex [0-9]{1,2} matches one or two digits.
Try
If you take it apart piece by piece, it's not as scary as it looks at first (although it will still look like line noise ).Find what: ^[0-9]{1,2}/[0-9]{1,2} ([0-9]{1,2}/[0-9]{1,2}) .{8}(.* )(-?[0-9]{1,3}(,[0-9]{3})*\.[0-9]{2})$
Replace with: \1;\2;$\3
[X] Regular expression
Replace All