Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Need to write a Python program that prints the words repeated a given number of times in a large text file

I need to write a Python program that takes a number as input and prints the words that are repeated that many times in a large text file.

Example: task.txt contains different words separated by spaces:

apple basket apple orange apple basket lemon basket


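My Python program:

with open('task.txt') as f:
    string = f.read()

string = string.lower()
n = int(input("enter number :"))

# split() with no argument also handles newlines and repeated spaces
words = string.split()

print("the words which are repeated given number of times : ")
found = False
for i in range(len(words)):
    if words[i] is None:  # already counted as a duplicate of an earlier word
        continue
    count = 1
    for j in range(i + 1, len(words)):
        if words[i] == words[j]:
            count = count + 1
            words[j] = None  # mark the duplicate so it is not counted again
    if count == n:
        print(words[i])
        found = True
# Report failure only after every word has been checked
if not found:
    print('no word found')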

Output:

enter number :3
the words which are repeated given number of times : 
apple
basket

The code works fine on small files, but how can I read the data in chunks if the file is 100 GB?

question from:https://stackoverflow.com/questions/66052330/need-to-write-a-python-program-to-print-words-which-are-repeated-number-of-times


1 Answer


Read the file one line at a time instead of reading the whole thing into memory.

You can use collections.Counter() to get the count of each word, then go through the counts to find all the words that were repeated at least n times.

import collections

word_counts = collections.Counter()

# Iterate over the file line by line so only one line is in memory at a time
with open('task.txt') as f:
    for line in f:
        word_counts.update(line.lower().split())

n = int(input("enter number :"))
print("the words which are repeated given number of times : ")
# Keep the words that occur at least n times
words = [word for word, count in word_counts.items() if count >= n]
if len(words) == 0:
    print('no word found')
else:
    print(*words, sep='\n')
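
Reading line by line only helps if the file actually contains line breaks. If the 100 GB file is a single long line of space-separated words, as the example in the question suggests, iterating over lines would still load everything at once. Below is a minimal sketch of reading fixed-size chunks instead, assuming whitespace-separated words. The names count_words_in_chunks, words_repeated and chunk_size are hypothetical and chosen only for illustration. The one subtle point is that a chunk can end in the middle of a word, so the last partial word is carried over to the next chunk.

```python
import collections


def count_words_in_chunks(path, chunk_size=1024 * 1024):
    """Count words while holding at most one chunk of text in memory."""
    word_counts = collections.Counter()
    leftover = ''  # possibly incomplete word carried over from the previous chunk
    with open(path) as f:
        while True:
            chunk = f.read(chunk_size)  # reads at most chunk_size characters
            if not chunk:
                break  # end of file
            chunk = leftover + chunk.lower()
            words = chunk.split()
            # If the chunk does not end in whitespace, its last word may be
            # cut off, so hold it back and prepend it to the next chunk.
            if words and not chunk[-1].isspace():
                leftover = words.pop()
            else:
                leftover = ''
            word_counts.update(words)
    # Whatever is left after the final chunk is a complete word.
    if leftover:
        word_counts[leftover] += 1
    return word_counts


def words_repeated(word_counts, n, exact=True):
    """Return words seen exactly n times, or at least n times if exact is False."""
    if exact:
        return [word for word, count in word_counts.items() if count == n]
    return [word for word, count in word_counts.items() if count >= n]
```

With exact=True this matches the count == n test in the question's code, and exact=False matches the count >= n test used above. Note that the Counter still keeps one entry per distinct word, so memory use grows with the vocabulary of the file rather than with its size.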
