Problem
Write a bash script to calculate the frequency of each word in a text file words.txt.
For simplicity's sake, you may assume:
- words.txt contains only lowercase characters and space ' ' characters.
- Each word must consist of lowercase characters only.
- Words are separated by one or more whitespace characters.
Example:
Assume that words.txt has the following content:
    the day is sunny the the
    the sunny is is
Your script should output the following, sorted by descending frequency:
    the 4
    is 3
    sunny 2
    day 1
Note:
- Don’t worry about handling ties; it is guaranteed that each word’s frequency count is unique.
- Could you write it in one line using Unix pipes?
Explanation
- grep -oE '[a-z]+' words.txt prints each word of the text file on its own line:

    the
    day
    is
    sunny
    the
    the
    the
    sunny
    is
    is

- sort sorts the output, so that identical words end up on adjacent lines:

    day
    is
    is
    is
    sunny
    sunny
    the
    the
    the
    the

- uniq collapses adjacent duplicate lines; the -c option prefixes each word with its number of occurrences:

    1 day
    3 is
    2 sunny
    4 the

- sort -nr sorts numerically by the number of occurrences, in descending order:

    4 the
    3 is
    2 sunny
    1 day

- awk '{print $2 " " $1}' prints each line with the count and the word swapped:

    the 4
    is 3
    sunny 2
    day 1
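
The stages described above can be tried one at a time. The following is a minimal sketch, assuming a small sample input written to a temporary file; the sample text, the use of mktemp, and the variable names (sample, words, sorted, counted, ranked, result) are illustrative assumptions rather than part of the original write-up.

```shell
# Sketch: run the pipeline one stage at a time on a throwaway sample file.
# The sample text and all variable names are assumptions for illustration.
sample=$(mktemp)
printf 'the day is sunny the the\nthe sunny is is\n' > "$sample"

# Stage 1: extract runs of lowercase letters, one word per line.
words=$(grep -oE '[a-z]+' "$sample")

# Stage 2: sort so that identical words sit on adjacent lines.
sorted=$(printf '%s\n' "$words" | sort)

# Stage 3: collapse adjacent duplicates, prefixing each word with its count.
counted=$(printf '%s\n' "$sorted" | uniq -c)

# Stage 4: order numerically by count, largest first.
ranked=$(printf '%s\n' "$counted" | sort -nr)

# Stage 5: swap the columns so the word comes before its count.
result=$(printf '%s\n' "$ranked" | awk '{print $2 " " $1}')

printf '%s\n' "$result"
rm -f "$sample"
```

Printing any of the intermediate variables (for example counted) shows what that stage contributes. Note that uniq -c typically pads its counts with leading spaces; awk splits on runs of whitespace, so the padding does not affect the final step.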
Solution
    grep -oE '[a-z]+' words.txt | sort | uniq -c | sort -nr | awk '{print $2 " " $1}'
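
If the pipeline from the explanation needs to run against files other than words.txt, it can be wrapped in a small function. This is only a sketch: the function name word_freq and the optional file argument (defaulting to words.txt) are hypothetical and not part of the original problem.

```shell
# word_freq is a hypothetical helper name, not part of the original problem.
# It prints "word count" pairs for the given file, most frequent first.
# With no argument it falls back to words.txt in the current directory.
word_freq() {
  grep -oE '[a-z]+' "${1:-words.txt}" \
    | sort \
    | uniq -c \
    | sort -nr \
    | awk '{print $2 " " $1}'
}

# Usage: build a small sample file (assumed content) and print its frequencies.
tmpdir=$(mktemp -d)
printf 'the day is sunny the the\nthe sunny is is\n' > "$tmpdir/words.txt"
output=$(word_freq "$tmpdir/words.txt")
printf '%s\n' "$output"
rm -rf "$tmpdir"
```

Because ties are ruled out by the problem statement, the sketch does nothing to order words that share a count.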