To transfer pdf credit card statements to spreadsheet

General questions about using TextPad

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

Post Reply
raygg
Posts: 3
Joined: Tue Jul 01, 2003 6:20 pm
Location: Natick, MA,USA

To transfer pdf credit card statements to spreadsheet

Post by raygg »

I needed to get credit card statement amounts into a spreadsheet from pdf statements. After copying the statements to Textpad - delete the non-money lines. My statement lines look like this:

12/12 12/12 X46WSAQ2 FILENES BOSTON MA 1,923.71

Turn Posix format on. This will insert $ in front of all amounts at the end of lines
([-]*[0-9]*[0-9]*[0-9,][0-9]*[0-9]*[0-9]*[\.][0-9][0-9])$
$\1
This will accommodate amounts up to 99,999.99 as well as credits without losing the minus sign. Once that is done - delete the first column of dates, append ';' after remaining column of dates and precede all '$' with ';'. Then drop the statement lines into a spreadsheet and use the convert text to columns delimited by ';'

What is moved is as follows

12/12; X46WSAQ2 FILENES BOSTON MA ;$1,923.71
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

You can do all that in one go:
Find what: ^[0-9]{1,2}/[0-9]{1,2} ([0-9]{1,2}/[0-9]{1,2})(.* )(-?[0-9]{1,3}(,[0-9]{3})*\.[0-9]{2})$
Replace with: \1;\2;$\3

[X] Regular expression

Replace All
This handles amounts of any size.

Note that * is the repetition operator. So
[0-9]*[0-9]*
is exactly equivalent to
[0-9]*
(except that it's slower) and matches any (possibly empty) sequence of digits.

The optionality operator is ?.
raygg
Posts: 3
Joined: Tue Jul 01, 2003 6:20 pm
Location: Natick, MA,USA

Post by raygg »

Thank you for the tip. I am fairly inexperienced doing this so when I figured it out I was quite pleased with myself.

Since this was posted I decided I did not want to carry forward the coded account item info - the 8 characters of info that appears after the 2nd date - so I changed to block mode to delete columns of these - however I am sure this can be done programmatically, if you have a suggestion. The catch is that if the date is Jan-Sep then the info shifts as below

2/12 2/12 EOFMPE94 FILENES BOSTON MA 1,923.71
12/12 12/12 X46WSAQ2 FILENES BOSTON MA 723.71
2/01 2/01 EOYYYE94 FILENES BOSTON MA -16.71

So the result should be

2/12; FILENES BOSTON MA ;$1,923.71
12/12; FILENES BOSTON MA ;$723.71
2/01; FILENES BOSTON MA ;$-16.71

I am not sure how to do this.
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

You're welcome. Well done! It's good to see people trying to find a solution themselves before asking for help.

The regex [0-9]{1,2} matches one or two digits.

Try
Find what: ^[0-9]{1,2}/[0-9]{1,2} ([0-9]{1,2}/[0-9]{1,2}) .{8}(.* )(-?[0-9]{1,3}(,[0-9]{3})*\.[0-9]{2})$
Replace with: \1;\2;$\3

[X] Regular expression

Replace All
If you take it apart piece by piece, it's not as scary as it looks at first (although it will still look like line noise :-)).
Post Reply