Difficulty Opening a Large Text File

General questions about using TextPad

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

Lizette
Posts: 2
Joined: Wed Nov 19, 2003 9:08 pm
Location: Washington, DC

Difficulty Opening a Large Text File

Post by Lizette »

We are having difficulty opening a large text file and are looking for assistance. According to the TextPad online help, "The editor can handle file sizes up to the limit of virtual memory." In the past this has been the only limitation, and increasing the amount of virtual memory on a machine has allowed us to open files upwards of 1 GB. For this file, I had LAN Support increase my maximum paging file size from 1,300 MB to 2,300 MB, which should be large enough. I also have approximately 5.6 GB of free space on this drive, so in theory there shouldn't be a problem. Please advise.
Bob Hansen
Posts: 1516
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

It sounds like you have not had a problem in the past, so this may not be of help, but I seem to recall that large files (especially Unicode) were a problem in earlier versions, probably more than two years ago. That was corrected in a later release around the same time.

Are you using the latest version, 4.7.2?
Hope this was helpful.............good luck,
Bob
Lizette
Posts: 2
Joined: Wed Nov 19, 2003 9:08 pm
Location: Washington, DC

Difficulty Opening a Large Text File

Post by Lizette »

We're running 4.6.2.
bbadmin
Site Admin
Posts: 939
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

A consequence of the 32-bit address model of Windows is that the largest contiguous address space in virtual memory is 2^31 - 1 bytes (2,147,483,647 bytes, i.e. just under 2 GB). Files can be larger, but they must then be processed in 2 GB chunks. Editing such files is a fairly unusual requirement, and it would add a lot of complexity to the way TextPad works, so we decided not to implement that capability.
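
For anyone who wants to sanity-check a file against that limit before opening it, here is a minimal Python sketch of the arithmetic (the helper name and file name are only examples, and it says nothing about how TextPad itself maps memory):

import os

LIMIT_32BIT = 2**31 - 1   # 2,147,483,647 bytes: the most a 32-bit process can map in one piece

def fits_in_one_mapping(path):
    """Return True if the file is small enough to map in a single chunk."""
    size = os.path.getsize(path)
    print(f"{path}: {size:,} bytes ({size / 2**30:.2f} GB)")
    return size <= LIMIT_32BIT

# Example: fits_in_one_mapping("myBigFile.txt")
# The file in this thread is 1,946,587,671 bytes, so it passes the check,
# but only by roughly 200 MB.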

Keith MacDonald
Helios Software Solutions
Larry
Posts: 2
Joined: Thu Nov 20, 2003 4:45 pm
Location: Washington, DC

Post by Larry »

Well, we're under the 2 GB limit mentioned above, though not by much: the file size is 1,946,587,671 bytes. The error we receive when opening the file is that the disk is full.

[Screenshot: TextPad error dialog reporting that the disk is full (Textpad_Error1.jpg, local file path; image not available)]

As Lizette mentioned above, we've increased the size of the paging file and there seems to be ample free space on this hard drive. Is the file size too close to the 2 GB limit? If the file were processed in chunks, does that mean the first 2 GB would be loaded? Any further help or ideas would be appreciated!

BTW, we just downloaded and installed 4.7.2, but still receive the same message.
bbadmin
Site Admin
Posts: 939
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

The most likely cause is that there is not a big enough contiguous chunk of free VM to map the file into. It's possible that Windows operates with some overhead that prevents it from allocating right up to the 2 GB limit, but you could try defragmenting VM using PageDefrag from http://www.sysinternals.com.

Keith MacDonald
Helios Software Solutions
talleyrand
Posts: 624
Joined: Mon Jul 21, 2003 6:56 pm
Location: Kansas City, MO, USA

Post by talleyrand »

Just out of curiosity, is it necessary to use the whole file at once, Liz? It's a bit of a kludge, but you could use wc, head and tail (Unix utilities) to split the file into halves. That would drop you down to a paltry gigabyte of text (which I'll admit my mind struggles to fathom), which would be within TextPad's tolerances. Recombine the halves when you have completed your editing.
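
If the Unix tools are a hurdle, here is a rough Python sketch of the same idea (file names are only examples): count the lines first, then stream the first half into one file and the rest into another.

def split_in_half(path):
    with open(path, "rb") as f:
        total = sum(1 for _ in f)                     # pass 1: count lines, like wc -l
    mid = total // 2
    with open(path, "rb") as src, \
         open(path + ".part1", "wb") as first, \
         open(path + ".part2", "wb") as second:
        for i, line in enumerate(src):                # pass 2: head half, tail half
            (first if i < mid else second).write(line)

# split_in_half("myBigFile.txt")

Reading in binary mode keeps the line endings exactly as they were, so the halves recombine to the original file byte for byte.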
I choose to fight with a sack of angry cats.
Larry
Posts: 2
Joined: Thu Nov 20, 2003 4:45 pm
Location: Washington, DC

Post by Larry »

We do not have expertise at this level of Windows, and our unit is unfamiliar with Unix. TextPad itself would presumably allow us to cut the file down if we could only load it.

Following the lead we received here, we tried installing the pagefile defragmentation utility from SysInternals, but it would not install on XP. Note that it's freeware, and support is not exactly promised, but after writing to the developers with an explanation of our problem, we received the following response. Please note, we are not trying to start a conflict, just to find a solution. All help thus far has been greatly appreciated. Here's the message back from the SysInternals developers:

Helios is incorrect. Virtual memory is always contiguous, regardless of
whether the backing store is contiguous on disk. Having a contiguous
page file can have a (tiny) effect on performance, but it won't impact
an application's ability to perform a given task.
--
Bryce Cogswell
Winternals Software


Larry Lie - BLS wrote:

> Hi Bryce,
>
> We were referred to your paging file defragmenter by the folks at
> Helios Software when we could not load a large text file into their
> Textpad text file editor. The limitation, according to Helios, was
> that the amount of virtual memory we wanted to use had to be contiguous.
>
> However, we are having difficulty trying to install and run PageDefrag
> from your Sysinternals website. Please let us know which link would
> be suitable for Windows XP on an Intel P4 machine and how it should be
> installed. Any help would be appreciated! Thanks.
>
> Larry Lie
> Bureau of Labor Statistics
talleyrand
Posts: 624
Joined: Mon Jul 21, 2003 6:56 pm
Location: Kansas City, MO, USA

Post by talleyrand »

The Unix part is pretty easy: grab Cygwin and put a reference to its bin directory (c:\cygwin\bin) in your path. I was going to say run wc -l on the file and then apply head and tail with half the line count to each piece, but I figured this was too common a need to roll my own. I was right.
csplit is the Unix utility you want (and far cooler than the Python script I wrote).

Assuming myBigFile.txt is 2000 lines
c:\>csplit -f myBigFileTmp.txt myBigFile.txt 1000
would then create
myBigFileTmp.txt00
myBigFileTmp.txt01

To find the number of lines in a file, you can either use the Unix utility wc with the -l parameter, or use the NT FIND command with /c (count the matching lines) and /v "" (invert the match on an empty string, which ends up matching every line).
So c:\>wc -l myBigFile.txt or c:\>type myBigFile.txt | find /c /v ""


The beauty of the csplit solution is that it can split the file into any number of pieces in just about any way you see fit (regular expressions included).

Putting them back together is trivial.
c:\>copy /y /b myBigFileTmp.txt* myBigFile.txt
(the /b switch copies the pieces byte for byte, so no stray end-of-file character gets appended)

Granted, I know this doesn't address the larger issue you're having, but I'm certainly not qualified to talk about what's going on; I just provide band-aids. Perhaps someone has enough DOS batch-fu to write something that gets the line count of a file, calls csplit with the filename and half that line count, and then a script to rejoin the files and clean up the temporary pieces. I piddled around with it but couldn't seem to get it working right.
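
In that spirit, here is a rough Python sketch of the rejoin-and-clean-up step, assuming the pieces were made with the csplit command above (the myBigFileTmp.txt prefix and names are only examples):

import glob, os, shutil

def rejoin(prefix, output):
    parts = sorted(glob.glob(prefix + "*"))      # myBigFileTmp.txt00, myBigFileTmp.txt01, ...
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)     # stream each piece back, in order
    for part in parts:
        os.remove(part)                          # clean up the temporary pieces

# rejoin("myBigFileTmp.txt", "myBigFile.txt")

Plain lexicographic sorting works here because csplit numbers the pieces with fixed-width suffixes.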
I choose to fight with a sack of angry cats.
Bob Hansen
Posts: 1516
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

You may want to try out SC-SplitMerge from http://www.soft-central.net/splitmerge.php. It's freeware: command-line utilities for Microsoft operating systems.

SC-SplitMerge is a set of two executables, SC-Split.exe and SC-Merge.exe.

With SC-Split, any file can be split into parts of user-specified sizes. This is handy for dealing with huge files. Another application of SC-Split is breaking a file into floppy-disk-sized parts, so you can copy a huge file to multiple disks.

SC-SplitMerge is command-line based (it has no UI).

This could be used with Macro Scheduler, which could do the calculations needed to split the file, call SC-Split, open each piece in TextPad, save it, close TextPad, and then call SC-Merge to tie them all back together.
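
Not SC-Split itself, just a minimal Python sketch of the same idea for anyone curious (chunk size and file names are arbitrary examples): cut the file into fixed-size byte chunks.

def split_by_size(path, chunk_bytes=500 * 2**20):    # 500 MB pieces, for example
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_bytes)
            if not chunk:
                break
            with open(f"{path}.{index:03d}", "wb") as out:
                out.write(chunk)
            index += 1

# split_by_size("myBigFile.txt")

One caveat: splitting by byte count can cut a line in half at each boundary, so for editing in TextPad the line-based csplit approach above is friendlier.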
Hope this was helpful.............good luck,
Bob