Inverted index #2

@dwhyte4

Hi, I tried running your code to see the output, but it didn't work.
The errors:
The variable 'reduce' is not defined.
In the two inverted functions for documents, the dictionary was not iterable.
Do you have any idea why it's not working?
I've put my code below, as I am trying to build an inverted index similar to yours and also want to run a query by inputting words to search for. I would also like the program to read queries from a text file. Any advice on how I can move forward?
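
One likely cause of the first error, assuming the code is being run under Python 3: `reduce` is no longer a built-in there and has to be imported from `functools`. The sketch below shows the import in use. The `postings` data and the `intersect_all` helper are made up for illustration and are not taken from the repository.

```python
from functools import reduce  # in Python 3, reduce lives in functools

# Hypothetical posting lists: word -> set of document ids
postings = {
    "inverted": {1, 2, 3},
    "index": {2, 3},
    "query": {3, 4},
}

def intersect_all(words, postings):
    """Return the ids of documents that contain every word in `words`."""
    sets = [postings.get(w, set()) for w in words]
    if not sets:
        return set()
    # Fold the list of sets into one set by repeated intersection
    return reduce(lambda a, b: a & b, sets)

print(intersect_all(["inverted", "index"], postings))
```

Whether the second error has a related cause cannot be told without the full traceback.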
`
from collections import Counter
import re  # I used regular expressions to easily remove unwanted characters
import math
import buildindex

# Read the whole file; .lower() returns a version with all upper-case
# characters replaced with lower-case characters.
with open('docs.txt', 'r') as file:
    tex = file.read().lower()

# Replace anything that is not a lowercase letter, a space, or an apostrophe with a space.
# For some reason, even though the text is in lower case, the code doesn't work unless I redo that condition.
text = re.sub(r"[^a-z ']+", " ", tex)
words = text.split()  # split the text into a list of words
Count = Counter(words)  # counts how many times each separate word occurs
Total = sum(Count.values())  # total number of words used

print("Words in dictionary: ")
dictionary = {}
for i in words:
    if i in dictionary:
        dictionary[i] += 1
    else:
        dictionary[i] = 1

# Words used more than once are counted as one entry; print the number of distinct words
print(len(dictionary))

print(dictionary)

# Take input
query = input(" Query : ")
`
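
As a possible way forward, here is a minimal sketch of an inverted index with an AND-style query, reusing the same tokenizing idea as the code above. It is not the repository's implementation. The function names (`tokenize`, `build_index`, `search`, `read_queries`) and the sample documents are assumptions for illustration, and it assumes each document is a separate string (for example, one line of `docs.txt`).

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the sorted ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

def read_queries(lines):
    """Yield non-empty queries from any iterable of lines (e.g. an open file)."""
    for line in lines:
        line = line.strip()
        if line:
            yield line

# Made-up sample documents; in practice these could be the lines of docs.txt.
docs = [
    "The quick brown fox",
    "The lazy dog",
    "A quick brown dog",
]
index = build_index(docs)

# A list of lines stands in for a query file here.
for query in read_queries(["quick brown\n", "\n", "dog\n"]):
    print(query, "->", search(index, query))
```

For an interactive query, the result of `input(" Query : ")` can be passed straight to `search`. For queries stored in a text file, an open file object can be passed to `read_queries`, since it only needs something that yields lines.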
