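My initial approach was:
read the whole file, split it on whitespace into an array, then loop over the array to build a hash. But if an article is long and there are several articles,
storing everything in an array first does not seem ideal and looks inefficient. How can I improve this so that the hash is built directly while the file is being read, without creating a temporary array?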
text_in:
The U.N. Food and Agriculture Organization says it has less than half the funding it needs to help ensure food security in parts of South Sudan.
.......
(It is too long to paste in full; assume the text is well-formed.)
I want to build the following hash %Words:
(
The => 1,
U.N. => 1,
Food => 1,
...
)
My earlier idea was:
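# Assumes $IN1 is a filehandle that has already been opened for reading.
my $content;
{
    local $/ = undef;    # slurp mode: read the whole file in one go
    $content = <$IN1>;
    close($IN1);
    #print "$content\n";
}
# split ' ' splits on runs of whitespace and skips leading whitespace,
# so no empty-string keys are created (split /\s/ can produce them).
my @words1 = split ' ', $content;
my %Words1 = map { $_ => 1 } @words1;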
Is it possible to skip the temporary array and build the hash directly, and would that be any faster?
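One possible way to do what is being asked, as a minimal sketch rather than a definitive answer: read the file line by line and assign each word to the hash as it is seen. The helper name `words_from_handle` and the in-memory sample text are made up for illustration; in real use the filehandle would be the `$IN1` from the post. Note that `split` still returns a short list for each line, so this avoids holding the whole file and the full word array in memory, but it is not guaranteed to run faster.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): build a word => 1 hash
# from an open filehandle, one line at a time, without slurping the file.
sub words_from_handle {
    my ($fh) = @_;
    my %words;
    while (my $line = <$fh>) {
        # split ' ' splits on runs of whitespace and ignores leading
        # whitespace, so empty strings never become hash keys.
        $words{$_} = 1 for split ' ', $line;
    }
    return \%words;
}

# Demo on an in-memory filehandle so the sketch runs on its own;
# the sample text is an assumption based on the excerpt in the post.
my $text = "The U.N. Food and Agriculture Organization says\n"
         . "it has less than half the funding\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my $words = words_from_handle($in);
close $in;

print join(' ', sort keys %$words), "\n";
```

If slurping the file is acceptable, the named array can also be dropped by feeding `split` straight into `map`, as in `my %Words1 = map { $_ => 1 } split ' ', $content;`. Whether either variant is actually faster on real input is best measured, for example with the core `Benchmark` module.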