Perl Out Of Memory Error
Resolving Out of Memory error when executing Perl script
(stackoverflow.com/questions/8128774/resolving-out-of-memory-error-when-executing-perl-script)

I'm attempting to build an n-gram language model based on the top 100K words found in the English-language Wikipedia dump. I've already extracted the plain text with a modified XML parser written in Java, but I need to convert it to a vocab file. I found a Perl script that is said to do the job, but it lacks instructions on how to execute it. Needless to say, I'm a complete newbie to Perl, and this is the first time I've encountered a need for it.

When I run this script on a 7.2 GB text file, I get an Out of Memory error on two separate dual-core machines with 4 GB RAM, running Ubuntu 10.04 and 10.10. When I contacted the author, he said the script ran fine on a MacBook Pro with 4 GB RAM and Perl 5.12, with total in-memory usage of about 78 MB on a 6.6 GB text file. The author also said that the script reads the input file line by line and creates a hashmap in memory.

The script is:

```perl
#!/usr/bin/perl
use FindBin;
use lib "$FindBin::Bin";
use strict;
require 'english-utils.pl';

## Create a list of words and their frequencies from an input corpus document
## (format: plain text, words separated by spaces, no sentence separators)
## TODO should words with hyphens be expanded? (e.g. three-dimensional)

my %dict;
my $min_len  = 3;
my $min_freq = 1;

while (<>) {
    chomp($_);
    my @words = split(" ", $_);
    foreach my $word (@words) {
        # Check validity against regexp and acceptable use of apostrophe
        if ((length($word) >= $min_len)
            && ($word =~ /^[A-Z][A-Z\'-]+$/)
            && (index($word, "'") < 0 || allow_apostrophe($word))) {
            $dict{$word}++;
        }
    }
}

# Output words which occur $min_freq times or more
foreach my $dictword (keys %dict) {
    if ($dict{$dictword} >= $min_freq) {
        print $dictword . "\t" . $dict{$dictword} . "\n";
    }
}
```

I'm executing this script from the command line via:

mkvocab.pl corpus.txt

The included extra script is simply a regex script to …
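The question doesn't show the input file's line endings, but a common cause of exactly this symptom (a line-by-line reader exhausting 4 GB on a file that reportedly used ~78 MB elsewhere) is input whose records are not terminated by "\n" — for example classic Mac "\r" endings, which Perl on Linux does not treat as line terminators, so `while (<>)` slurps the entire 7.2 GB file as a single "line". A minimal diagnostic sketch, assuming the corpus path is passed as the first argument:

```perl
#!/usr/bin/perl
# Hedged diagnostic sketch (not part of the original script): print the size
# of the first record. If the corpus has no "\n" line terminators, <$fh>
# returns the whole file at once, and the subsequent split() doubles the
# memory needed -- far beyond 4 GB for a 7.2 GB corpus.
use strict;
use warnings;

open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
my $first = <$fh>;
printf "first record is %d bytes\n", length $first;
close $fh;
```

If the first record turns out to be gigabytes long, converting the line endings (e.g. with a tool such as mac2unix) or reading fixed-size blocks instead of lines (setting `local $/ = \65536;` before the loop) should bring memory use back in line with what the author reported.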
Why does my Perl script die with an "out of memory" exception?
(stackoverflow.com/questions/2201432/why-does-my-perl-script-die-with-an-out-of-memory-exception, asked Feb 4 '10 by Floopy-Doo, edited Aug 17 '10 by Greg Bacon)

I need to read a 200 MB space-separated file line by line and collect its contents into an array. Every time I run the script, Perl throws an "out of memory" exception, but I don't understand why! Some advice, please?

```perl
#!/usr/bin/perl -w
use strict;
use warnings;

open my $fh, "<", "../cnai_all.csd";

my @parse = ();
while (<$fh>) {
    my @words = split(/\s/, $_);
    push(@parse, \@words);
}

print scalar @parse;
```

The cnai file contains 11,000 rows with 4,200 values, separated by spaces, per line. It looks like this:

VALUE_GROUP_A VALUE_GROUP_B VALUE_GROUP_C
VALUE_GROUP_A VALUE_GROUP_B VALUE_GROUP_C
VALUE_GROUP_A VALUE_GROUP_B VALUE_GROUP_C
VALUE_GROUP_A VALUE_GROUP_B VALUE_GROUP_C

The code above is just a stripped-down sample. The final script will store all values in a hash and write them to a database later. But first, I have to solve that memory problem!

Tags: database, perl, memory

Comments:
- "Code said while (<$fh>) but it was not displayed in the markup." –mob
- "Please edit your question to give us an idea what the contents of cnai_all.csd look like." –Greg Bacon
- "See also: stackoverflow.com/questions/1663498/finding-a-perl-memory-leak" –Ether

If all you want is the n…
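For scale: 11,000 rows of 4,200 values is roughly 46 million Perl scalars, and every scalar carries a few dozen bytes of interpreter overhead on top of its string data, so an array-of-arrays holding them all can need several gigabytes even though the file itself is only 200 MB. A sketch of the streaming alternative, assuming each line can be handled independently (the tally below is illustrative and stands in for the poster's eventual database insert):

```perl
#!/usr/bin/perl
# Hedged sketch: process each line as it is read and keep only the running
# result, instead of accumulating ~46 million scalars in @parse.
use strict;
use warnings;

open my $fh, '<', '../cnai_all.csd' or die "open failed: $!\n";

my %count;                        # running per-value counts, not raw rows
while (my $line = <$fh>) {
    chomp $line;
    my @words = split /\s+/, $line;
    $count{$_}++ for @words;      # e.g. write to the database here instead
}
close $fh;

print scalar(keys %count), " distinct values\n";
```

With this shape, peak memory is bounded by one line plus the summary hash, not by the whole file.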
Out of memory error problem (PERL Beginners)
(http://www.justskins.com/forums/out-of-memory-error-115248.html, posted December 16th, 07:15 PM)

I wrote a small script that uses message IDs as unique values and extracts recipient address info. The goal is to count 1019 events per message ID. It also gets the sum of recipients per message ID. The script works fine, but when it runs against a very large file (2 GB+) I receive an out of memory error. Is there a more efficient way of handling the hash portion that is less memory-intense and preferably faster? --Paul

```perl
# Tracking log pr
use strict;

my $recips;
my %event_id;
my $counter;
my $total_recips;
my $count;

# Get log file
die "You must enter a tracking log. \n" if $#ARGV < 0;
my $logfile = shift;
open (LOGFILE, $logfile) || die "Unable to open $logfile because\n $!\n";
foreach (
```

(The rest of the script is cut off in the source.)
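Since the script is cut off, the exact hash usage isn't visible, but if a per-message-ID hash built over a 2 GB+ log is what exhausts memory, one standard remedy is to tie that hash to an on-disk database file so that only a cache lives in RAM. A sketch assuming a tab-separated log with the event code, message ID, and recipient count in the first three fields — the real log layout is not shown in the post, so these positions are hypothetical:

```perl
#!/usr/bin/perl
# Hedged sketch: tie the per-message-ID hash to an on-disk DB_File store so
# memory use stays bounded regardless of how many message IDs the log holds.
# The field positions below are assumptions; the original script is truncated.
use strict;
use warnings;
use DB_File;

tie my %recips_by_msg, 'DB_File', 'recips_by_msg.db'
    or die "Cannot tie hash to recips_by_msg.db: $!\n";

while (my $line = <>) {
    chomp $line;
    my ($event, $msg_id, $recips) = (split /\t/, $line)[0, 1, 2];  # assumed layout
    next unless defined $msg_id;
    $recips_by_msg{$msg_id} += $recips if $event eq '1019';  # sum recipients per ID
}

untie %recips_by_msg;
```

A tied hash is slower per access than an in-memory one, so this trades speed for bounded memory; if the log could instead be processed sorted by message ID, each ID's totals could be printed and discarded as soon as the ID changes, which is both memory-safe and fast.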