Multi-lingual author profiling using stylistic features

Abdul Sittar, Iqra Ameer

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations

Abstract

Author profiling is the identification of an author’s traits by examining the text that the author has written. It has many useful applications in security, criminal investigation, marketing, and the identification of fake accounts on shared communication websites. We submitted our system to FIRE'18-MAPonSMS (Multi-lingual Author Profiling on SMS), a shared task on classifying author attributes such as gender and age group from multilingual text, specifically English + Roman Urdu. Roman Urdu is commonly used in SMS messaging, Facebook posts and comments, chats, game blogs, etc. Our system is based on 29 different stylistic features. On the training dataset, we achieved our best Accuracy = 73.714 for gender when using all 14 language-independent features together, and Accuracy = 58.571 for age group when using all 29 features together. On the test data, we obtained Accuracy = 0.55 for gender and 0.37 for age group.
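To make the idea of language-independent stylistic features concrete, here is a minimal Python sketch. It is not the authors' feature set: the abstract does not list the 29 features, so the function name `stylistic_features` and the six features below are illustrative assumptions, chosen only because they can be computed without knowing whether a message is in English or Roman Urdu.

```python
import string


def stylistic_features(text: str) -> dict:
    """Compute a few surface-level stylistic features of a message.

    Hypothetical example features (assumed, not taken from the paper);
    none of them requires language-specific resources.
    """
    words = text.split()  # whitespace tokenisation
    n_chars = len(text)
    return {
        "char_count": n_chars,
        "word_count": len(words),
        "avg_word_length": (
            sum(len(w) for w in words) / len(words) if words else 0.0
        ),
        "digit_count": sum(c.isdigit() for c in text),
        "punctuation_count": sum(c in string.punctuation for c in text),
        "uppercase_ratio": (
            sum(c.isupper() for c in text) / n_chars if n_chars else 0.0
        ),
    }


if __name__ == "__main__":
    # A made-up mixed English + Roman Urdu message, for illustration only.
    print(stylistic_features("Hello yar, kal 5 baje milte hain!"))
```

Feature dictionaries of this kind could be turned into vectors and passed to any standard classifier for gender or age-group prediction; which classifier the submitted system used is not stated in the abstract.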

Original language: English (US)
Pages (from-to): 240-246
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 2266
State: Published - 2018
Event: 10th Working Notes of FIRE - Forum for Information Retrieval Evaluation, FIRE-WN 2018 - Gandhinagar, India
Duration: Dec 6 2018 → Dec 9 2018

All Science Journal Classification (ASJC) codes

  • General Computer Science
