NoVAD added
nick-sh-oh committed Jan 10, 2024
1 parent e48c342 commit 6582754
Showing 8 changed files with 13,996 additions and 32 deletions.
29 changes: 15 additions & 14 deletions README.md
@@ -50,20 +50,21 @@ The predefined lexicon identifiers follow the convention {NAME}_{VERSION} - for

See below for the available predefined lexicon identifiers.

| Sentiment Dictionary | Associated Institution <br> (Principal Investigator) | Description | Genre | Domain | Predefined Identifiers (preprocessed) |
|------------------------|---------------------|---------------|------|-----|------------------------|
|**AFINN** <br> (Nielsen, 2011)| DTU Informatics <br> (Technical University of Denmark) | General purpose lexicon with sentiment ratings for common emotion words. |Social Media|General| `AFINN_v2009`, `AFINN_v2011`, `AFINN_v2015` |
|**Aigents+** <br> (Raheman et al., 2022)| Autonio Foundation | Lexicon optimised for social media posts related to cryptocurrencies. |Social Media|Cryptocurrency| `Aigents+_v2022`|
|**ANEW** <br> (Bradley and Lang, 1999)| NIMH Center for Emotion and Attention <br> (University of Florida) | Provides normative emotional ratings across pleasure, arousal, and dominance dimensions.|General|Psychology|`ANEW_v1999_simple`, `ANEW_v1999_weighted`|
|**Dictionary of Affect in Language (DAL)** <br> (Whissell, 1989; Whissell, 2009)| Laurentian University | Lexicon designed to quantify pleasantness, activation, and imagery dimensions across diverse everyday English words. | General | General | `DAL_v2009_norm`, `DAL_v2009_boosted`|
|**Discrete Emotions Dictionary (DED)** <br> (Fioroni et al., 2022)| Gallup | Lexicon focused on precisely distinguishing four key discrete emotions in political communication. | News | Political Science | `DED_v2022` |
|**General Inquirer** <br> (Stone et al., 1962)| Harvard University | Lexicon capturing broad psycholinguistic dimensions across semantics, values and motivations. |General|Psychology, Political Science| `HarvardGI_v2000`|
|**Henry** <br> (Henry, 2006) | University of Miami | Lexicon designed for analysing tone in earnings press releases. |Corporate Communication (Earnings Press Releases)|Finance| `Henry_v2006`|
|**MASTER** <br> (Loughran and McDonald, 2011; Bodnaruk, Loughran and McDonald, 2015)| University of Notre Dame | Financial lexicons covering expressions common in business writing. |Regulatory Filings (10-K)|Finance| `MASTER_v2022`|
|**OpinionLexicon** <br> (Hu and Liu, 2004)| University of Illinois Chicago | Opinion words tailored for sentiment analysis of product reviews.|Product Reviews|Consumer Products|`OpinionLexicon_v2004`|
|**SentiWordNet** <br> (Esuli and Sebastiani, 2006; Baccianella, Esuli and Sebastiani, 2010)| Institute of Information Science and Technologies <br>(Consiglio Nazionale delle Ricerche) | Lexicon associating WordNet synsets with positive, negative, and objective scores. |General|General| `SentiWordNet_v2010_simple`, `SentiWordNet_v2010_nuanced` |
|**VADER** <br> (Hutto and Gilbert, 2014)| Georgia Institute of Technology | General purpose lexicon optimised for social media and microblogs. |Social Media|General| `VADER_v2014`|
|**WordNet-Affect** <br> (Strapparava and Valitutti, 2004; Valitutti, Strapparava and Stock, 2004; Strapparava, Valitutti and Stock, 2006)| Institute for Scientific and Technological Research <br> (Fondazione Bruno Kessler) | Hierarchically organised affective labels providing a granular emotional dimension. |General|Psychology| `WordNet-Affect_v2006`|
| Sentiment Dictionary | Description | Genre | Domain | Predefined Identifiers (preprocessed) |
|------------------------|---------------|------|-----|------------------------|
|**AFINN** <br> (Nielsen, 2011)| General purpose lexicon with sentiment ratings for common emotion words. |Social Media|General| `AFINN_v2009`, `AFINN_v2011`, `AFINN_v2015` |
|**Aigents+** <br> (Raheman et al., 2022)| Lexicon optimised for social media posts related to cryptocurrencies. |Social Media|Cryptocurrency| `Aigents+_v2022`|
|**ANEW** <br> (Bradley and Lang, 1999)| Provides normative emotional ratings across pleasure, arousal, and dominance dimensions.|General|Psychology|`ANEW_v1999_simple`, `ANEW_v1999_weighted`|
|**Dictionary of Affect in Language (DAL)** <br> (Whissell, 1989; Whissell, 2009)| Lexicon designed to quantify pleasantness, activation, and imagery dimensions across diverse everyday English words. | General | General | `DAL_v2009_boosted`, `DAL_v2009_norm` |
|**Discrete Emotions Dictionary (DED)** <br> (Fioroni et al., 2022)| Lexicon focused on precisely distinguishing four key discrete emotions in political communication. | News | Political Science | `DED_v2022` |
|**General Inquirer** <br> (Stone et al., 1962)| Lexicon capturing broad psycholinguistic dimensions across semantics, values and motivations. |General|Psychology, Political Science| `HarvardGI_v2000`|
|**Henry** <br> (Henry, 2006) | Lexicon designed for analysing tone in earnings press releases. |Corporate Communication (Earnings Press Releases)|Finance| `Henry_v2006`|
|**MASTER** <br> (Loughran and McDonald, 2011; Bodnaruk, Loughran and McDonald, 2015)| Financial lexicons covering expressions common in business writing. |Regulatory Filings (10-K)|Finance| `MASTER_v2022`|
|**Norms of Valence, Arousal and Dominance (NoVAD)** <br> (Warriner, Kuperman and Brysbaert, 2013; Warriner and Kuperman, 2014)| Lexicon providing valence, arousal, and dominance ratings for nearly 14,000 common English lemmas. | General | Psychology | `NoVAD_v2013_adjusted`, `NoVAD_v2013_bidimensional`|
|**OpinionLexicon** <br> (Hu and Liu, 2004)| Opinion words tailored for sentiment analysis of product reviews.|Product Reviews|Consumer Products|`OpinionLexicon_v2004`|
|**SentiWordNet** <br> (Esuli and Sebastiani, 2006; Baccianella, Esuli and Sebastiani, 2010)| Lexicon associating WordNet synsets with positive, negative, and objective scores. |General|General| `SentiWordNet_v2010_logtransform`, `SentiWordNet_v2010_simple`|
|**VADER** <br> (Hutto and Gilbert, 2014)| General purpose lexicon optimised for social media and microblogs. |Social Media|General| `VADER_v2014`|
|**WordNet-Affect** <br> (Strapparava and Valitutti, 2004; Valitutti, Strapparava and Stock, 2004; Strapparava, Valitutti and Stock, 2006)| Hierarchically organised affective labels providing a granular emotional dimension. |General|Psychology| `WordNet-Affect_v2006`|

Refer to the [documentation](docs_link) for details on usage.
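
The identifiers in the table follow the `{NAME}_{VERSION}` convention, and several carry a third suffix such as `_adjusted` or `_simple`. A minimal Python sketch of splitting an identifier into those parts might look like this. `LexiconId` and `parse_identifier` are hypothetical helpers written for illustration, not part of sentibank, and calling the third part a "variant" is an assumption based only on the identifiers listed above.

```python
from typing import NamedTuple, Optional


class LexiconId(NamedTuple):
    name: str               # e.g. "NoVAD"
    version: str            # e.g. "v2013"
    variant: Optional[str]  # e.g. "adjusted"; None when there is no suffix


def parse_identifier(identifier: str) -> LexiconId:
    """Split '{NAME}_{VERSION}' or '{NAME}_{VERSION}_{VARIANT}' into its parts."""
    parts = identifier.split("_")
    # Assumption: the version part always starts with "v", as in the table.
    if len(parts) not in (2, 3) or not parts[1].startswith("v"):
        raise ValueError(f"unexpected lexicon identifier: {identifier!r}")
    variant = parts[2] if len(parts) == 3 else None
    return LexiconId(parts[0], parts[1], variant)


print(parse_identifier("NoVAD_v2013_adjusted"))
# LexiconId(name='NoVAD', version='v2013', variant='adjusted')
```

Names that contain punctuation, such as `Aigents+` or `WordNet-Affect`, pass through unchanged because only the underscore is treated as a separator.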

53 changes: 40 additions & 13 deletions doc/licenses.txt
@@ -18,7 +18,8 @@ https://github.com/aigents/aigents-java

#ANEW

?
Publicly Available
(Report)


#DED (Discrete Emotions Dictionary)
@@ -27,10 +28,13 @@ https://osf.io/cbm9e/

https://github.com/CenterForOpenScience/cos.io/blob/master/TERMS_OF_USE.md

Publicly Available
(Open Science Framework)

#Dictionary of Affect

?
#DAL (Dictionary of Affect in Language)

Publicly Available


#General Inquirer
@@ -44,7 +48,8 @@ https://inquirer.sites.fas.harvard.edu/spreadsheet_guide.htm

#Henry

Unspecified
Publicly available
(SSRN)


#MASTER
@@ -54,11 +59,6 @@ For commercial licenses, please contact us at loughranmcdonald@gmail.com.
https://sraf.nd.edu/loughranmcdonald-master-dictionary/


#MPQA

?


#News PMI

?
@@ -87,6 +87,7 @@ https://github.com/consose/SentiBigNomics

#SenticNet

MIT License
https://sentic.net/downloads/


@@ -120,9 +121,6 @@ You are free to share and adapt the lexicon, as long as you give attribution to
https://wndomains.fbk.eu/download.html





#EmoLex

Custom License
@@ -170,4 +168,33 @@ https://arxiv.org/abs/1406.0032


#SentiStrength
http://sentistrength.wlv.ac.uk/
http://sentistrength.wlv.ac.uk/


#MPQA

The annotations in this data collection are copyrighted by the MITRE Corporation.
User acknowledges and agrees that:
(i) as between User and MITRE, MITRE owns all the right, title and interest in the Annotated Content, unless expressly stated otherwise;
(ii) nothing in this Agreement shall confer in User any right of ownership in the Annotated Content; and
(iii) User is granted a non-exclusive, royalty free, worldwide licence (with no right to sublicense) to use the Annotated Content solely for academic and research purposes.
This Agreement is governed by the law of the Commonwealth of Massachusetts and User agrees to submit to the exclusive jurisdiction of the Massachusetts courts.


#+/-Effect Lexicon

GNU General Public License


#Warriner

Publicly available
(Springer)
https://www.springernature.com/gp/authors/research-data-policy/data-availability-statements


#HSSWE

https://github.com/NUSTM/HSSWE/tree/master
https://aclanthology.org/D17-1052/
CC BY 4.0 DEED
26 changes: 23 additions & 3 deletions sentibank/archive.py
@@ -102,9 +102,9 @@ def dict(self, idx: str):
             with open(file_path, "rb") as handle:
                 self.lex_dict = pickle.load(handle)

-        elif idx == "SentiWordNet_v2010_nuanced":
+        elif idx == "SentiWordNet_v2010_logtransform":
             file_path = os.path.join(
-                self.script_dir, "dict_arXiv", "SentiWordNet", "SentiWordNet_v2010_nuanced.pickle"
+                self.script_dir, "dict_arXiv", "SentiWordNet", "SentiWordNet_v2010_logtransform.pickle"
             )
             with open(file_path, "rb") as handle:
                 self.lex_dict = pickle.load(handle)
@@ -157,6 +157,20 @@ def dict(self, idx: str):
             )
             with open(file_path, "rb") as handle:
                 self.lex_dict = pickle.load(handle)
+
+        elif idx == "NoVAD_v2013_bidimensional":
+            file_path = os.path.join(
+                self.script_dir, "dict_arXiv", "NoVAD", "NoVAD_v2013_bidimensional.pickle"
+            )
+            with open(file_path, "rb") as handle:
+                self.lex_dict = pickle.load(handle)
+
+        elif idx == "NoVAD_v2013_adjusted":
+            file_path = os.path.join(
+                self.script_dir, "dict_arXiv", "NoVAD", "NoVAD_v2013_adjusted.pickle"
+            )
+            with open(file_path, "rb") as handle:
+                self.lex_dict = pickle.load(handle)

         else:
             raise ValueError
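
The `dict` method resolves each identifier through a growing `elif` chain and raises a bare `ValueError` for anything it does not recognise. One possible alternative, sketched here as an assumption and not as the project's code, is a lookup table plus an error message that names the valid identifiers. `PICKLE_PATHS` and `load_lexicon` are hypothetical names, the table lists only the identifiers touched by this commit, and the demo lexicon contents are made up.

```python
import os
import pickle
import tempfile

# Hypothetical registry: identifier -> path components below the package directory.
PICKLE_PATHS = {
    "SentiWordNet_v2010_logtransform": (
        "dict_arXiv", "SentiWordNet", "SentiWordNet_v2010_logtransform.pickle"),
    "NoVAD_v2013_bidimensional": (
        "dict_arXiv", "NoVAD", "NoVAD_v2013_bidimensional.pickle"),
    "NoVAD_v2013_adjusted": (
        "dict_arXiv", "NoVAD", "NoVAD_v2013_adjusted.pickle"),
}


def load_lexicon(script_dir: str, idx: str) -> dict:
    """Load the preprocessed lexicon for `idx`, or raise a descriptive ValueError."""
    try:
        parts = PICKLE_PATHS[idx]
    except KeyError:
        known = ", ".join(sorted(PICKLE_PATHS))
        raise ValueError(
            f"unknown lexicon identifier {idx!r}; expected one of: {known}"
        ) from None
    with open(os.path.join(script_dir, *parts), "rb") as handle:
        return pickle.load(handle)


# Demo against a throwaway directory holding a made-up two-word lexicon.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, *PICKLE_PATHS["NoVAD_v2013_adjusted"])
    os.makedirs(os.path.dirname(target))
    with open(target, "wb") as handle:
        pickle.dump({"happy": 0.9, "sad": -0.8}, handle)
    demo = load_lexicon(root, "NoVAD_v2013_adjusted")
```

With this shape, adding a lexicon means adding one table entry, and a renamed identifier such as `SentiWordNet_v2010_nuanced` fails with a message that lists the current names.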
@@ -261,7 +275,13 @@ def origin(self, idx: str):
                 self.script_dir, "dict_arXiv", "DAL", "DAL_v2009.csv"
             )
             self.origin_df = pd.read_csv(file_path)
-
+
+        elif idx == "NoVAD_v2013":
+            file_path = os.path.join(
+                self.script_dir, "dict_arXiv", "NoVAD", "NoVAD_v2013.csv"
+            )
+            self.origin_df = pd.read_csv(file_path)
+
         else:
             raise ValueError
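
In the `origin` branches shown in this hunk, the CSV path has the shape `dict_arXiv/<NAME>/<identifier>.csv`. Assuming that pattern holds (it is visible here only for `DAL_v2009` and `NoVAD_v2013`), the path could be derived from the identifier instead of being spelled out per branch. The sketch below uses the standard-library `csv` module in place of `pd.read_csv` so that it runs without pandas. `origin_csv_path` and `read_origin` are hypothetical helpers, and the demo file is made up, not real NoVAD data.

```python
import csv
import tempfile
from pathlib import Path


def origin_csv_path(script_dir: Path, idx: str) -> Path:
    """Build <script_dir>/dict_arXiv/<NAME>/<idx>.csv from an identifier like 'NoVAD_v2013'."""
    name, _, version = idx.partition("_")
    if not name or not version:
        raise ValueError(f"expected '{{NAME}}_{{VERSION}}', got {idx!r}")
    return script_dir / "dict_arXiv" / name / f"{idx}.csv"


def read_origin(script_dir: Path, idx: str) -> list:
    """Return the rows of the original lexicon file as a list of dicts."""
    with open(origin_csv_path(script_dir, idx), newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Demo with a throwaway directory and a made-up one-row file.
with tempfile.TemporaryDirectory() as root:
    path = origin_csv_path(Path(root), "NoVAD_v2013")
    path.parent.mkdir(parents=True)
    path.write_text("word,score\nexample,5.0\n", encoding="utf-8")
    rows = read_origin(Path(root), "NoVAD_v2013")
```

Unlike `pd.read_csv`, `csv.DictReader` leaves every value as a string, so numeric columns would still need converting before use.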
