Issue with sentiment analysis #1

@EmilyForden

Description

I'm running into two issues while trying to run sentiment analysis on the ancient writer Livy. The first is that Livy's book is divided into four parts, which must be downloaded separately, and I can only get parts 1 and 4 to download.

The second problem is causing me more distress. I'm trying to run sentiment analysis on the texts using the NRC lexicon, but I keep getting an error about the 'by' argument. I think this is because my inner_join is incorrect. Is my mutate line incorrect as well? I suspect it might be, since NRC isn't a simple positive/negative score but assigns words to several sentiment categories.

Thanks!

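library(gutenbergr)
library(dplyr)
library(tidyr)
library(stringr)
library(tidytext)

# Each title is kept on one line so that no line break ends up inside the string
titles <- c("The History of Rome, Books 01 to 08", "The History of Rome, Books 09 to 26",
            "The History of Rome, Books 27 to 36",
            "The History of Rome, Books 37 to the End with the Epitomes and Fragments of the Lost Books")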

books <- gutenberg_works(title %in% titles) %>%
  gutenberg_download(meta_fields = "title")

get_sentiments("nrc") %>%
  count(sentiment)

tidy_books <- books %>%
  group_by(title) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]", 
                                                 ignore_case = TRUE)))) %>%
  ungroup() %>%
  unnest_tokens(word, text)

Livysentiment <- books %>%
  inner_join(get_sentiments("nrc")) %>%
  count(title, index = linenumber %/% 80, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%
  mutate(sentiment = positive - negative)
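
For reference, here is a minimal sketch of the join step on toy data. With no `by` given, `inner_join()` looks for column names the two tables share, so the text table needs a `word` column (which `unnest_tokens(word, text)` creates) before it can be matched against a lexicon. The sketch makes several assumptions: `toy_lexicon` is a hypothetical three-word stand-in for `get_sentiments("nrc")`, `toy_tokens` is a hypothetical already-tokenized text, the 2-line index window is arbitrary, and `pivot_wider()` is swapped in for `spread()`.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical stand-in for the NRC lexicon: one row per (word, sentiment)
# pair, so a single word can carry several sentiments.
toy_lexicon <- tibble(
  word      = c("victory", "victory", "defeat", "defeat", "war"),
  sentiment = c("positive", "joy", "negative", "sadness", "negative")
)

# Hypothetical tokenized text: one row per word, with the line it came from.
toy_tokens <- tibble(
  title      = "Toy book",
  linenumber = c(1, 1, 2, 3, 3),
  word       = c("victory", "rome", "defeat", "war", "peace")
)

toy_sentiment <- toy_tokens %>%
  # Both tables have a `word` column, so the join key can be named explicitly.
  inner_join(toy_lexicon, by = "word") %>%
  # Count each sentiment category within 2-line windows (toy window size).
  count(title, index = linenumber %/% 2, sentiment) %>%
  # One column per sentiment category; categories absent from a window get 0.
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  # Net score uses only the positive and negative columns; the other
  # categories (joy, sadness, ...) remain available as separate columns.
  mutate(net = positive - negative)

print(toy_sentiment)
```

In this sketch the other sentiment categories survive as their own columns next to the net score, so `positive - negative` does not discard them. On real text, where words repeat, recent dplyr versions may warn that the join is many-to-many. That is expected with a lexicon that lists several sentiments per word.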
