1. Identity statement
Reference Type: Journal Article
Site: mtc-m21c.sid.inpe.br
Holder Code: isadg {BR SPINPE} ibi 8JMKD3MGPCW/3DT298S
Identifier: 8JMKD3MGP3W34R/43BQLBH
Repository: sid.inpe.br/mtc-m21c/2020/10.02.16.01 (restricted access)
Last Update: 2020:10.02.16.01.04 (UTC) simone
Metadata Repository: sid.inpe.br/mtc-m21c/2020/10.02.16.01.04
Metadata Last Update: 2022:01.04.01.35.26 (UTC) administrator
DOI: 10.1016/j.infsof.2020.106395
ISSN: 0950-5849
Citation Key: WatanabeFeCaSoCaVi:2020:ReEfSo
Title: Reducing efforts of software engineering systematic literature reviews updates using text classification
Year: 2020
Month: Dec.
Access Date: 2024, June 13
Type of Work: journal article
Secondary Type: PRE PI
Number of Files: 1
Size: 1242 KiB
2. Context
Author: 1 Watanabe, Willian Massami
2 Felizardo, Katia Romero
3 Candido Júnior, Arnaldo Candido
4 Souza, Érica Ferreira de
5 Campos Neto, José Ede de
6 Vijaykumar, Nandamudi Lankalapalli
Resume Identifier: 1
2
3
4
5
6 8JMKD3MGP5W/3C9JHTU
Group: 1
2
3
4
5
6 LABAC-COCTE-INPE-MCTIC-GOV-BR
Affiliation: 1 Universidade Tecnológica Federal do Paraná (UTFPR)
2 Universidade Tecnológica Federal do Paraná (UTFPR)
3 Universidade Tecnológica Federal do Paraná (UTFPR)
4 Universidade Tecnológica Federal do Paraná (UTFPR)
5 Universidade Tecnológica Federal do Paraná (UTFPR)
6 Instituto Nacional de Pesquisas Espaciais (INPE)
Author e-Mail Address: 1 wwatanabe@utfpr.edu.br
2 katiascannavino@utfpr.edu.br
3 arnaldoc@utfpr.edu.br
4 ericasouza@utfpr.edu.br
5
6 vijay.nl@inpe.br
Journal: Information and Software Technology
Volume: 128
Pages: e106395
Secondary Mark: A2_MEDICINA_I A2_CIÊNCIA_DA_COMPUTAÇÃO B1_INTERDISCIPLINAR B2_SOCIOLOGIA
History (UTC): 2020-10-02 16:01:58 :: simone -> administrator :: 2020
2022-01-04 01:35:26 :: administrator -> simone :: 2020
3. Content and structure
Is the master or a copy?: is the master
Content Stage: completed
Transferable: 1
Content Type: External Contribution
Version Type: publisher
Keywords: Systematic literature review; SLR; Automatic selection; Review update; Text classification; Document classification; Text categorization
Abstract: Context: Systematic Literature Reviews (SLRs) are frequently used to synthesize evidence in Software Engineering (SE); however, replicating SLRs and keeping them up to date is a major challenge. The study selection activity in an SLR is labor-intensive due to the large number of studies that must be analyzed. Different approaches have been investigated to support SLR processes, such as Visual Text Mining and Text Classification, but acquiring the initial dataset is time-consuming and labor-intensive.
Objective: In this work, we proposed and evaluated the use of Text Classification to support the study selection activity for new evidence when updating SLRs in SE.
Method: We applied Text Classification techniques to investigate how effective they are and how much effort could be saved during the study selection phase of an SLR update. In the SLR update scenario, the studies analyzed in the primary SLR can be used as a classified dataset to train Supervised Machine Learning algorithms. We conducted an experiment with 8 Software Engineering SLRs. In the experiments, we investigated multiple preprocessing and feature extraction tasks, such as tokenization, stop-word removal, word lemmatization, and TF-IDF (Term Frequency/Inverse Document Frequency), with Decision Tree and Support Vector Machines as classification algorithms. Furthermore, we configured the classifier activation threshold to maximize Recall, hence reducing the number of Missed selected studies.
Results: The accuracy of the techniques was measured; on average, the results achieved an F-Score of 0.92 and an exclusion rate of 62% when varying the activation threshold of the classifiers, with an average of 4% Missed selected studies. Both the Exclusion rate and the number of Missed selected studies were significantly different when compared to classifiers that did not use the activation threshold configuration.
Conclusion: The results showed the potential of the techniques to reduce the effort required for SLR updates.
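The activation-threshold idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the classifier scores below are hypothetical stand-ins for the output of a Decision Tree or SVM trained on TF-IDF features, the function names `pick_threshold` and `evaluate` are invented for this example, and the metric definitions (exclusion rate as the share of candidate studies the classifier discards, missed rate as the share of relevant studies it discards) are assumptions, since the abstract does not define them.

```python
import math
from typing import Dict, List

# Hypothetical classifier scores for ten candidate studies (higher = more
# likely relevant) and their true labels (1 = included in the review).
SCORES: List[float] = [0.9, 0.7, 0.4, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05, 0.02]
LABELS: List[int] = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def pick_threshold(scores: List[float], labels: List[int],
                   target_recall: float) -> float:
    """Return the highest threshold that still reaches target_recall.

    Assumed selection rule: sort the scores of the relevant studies and
    take the score of the last one that must be kept.
    """
    included = sorted((s for s, y in zip(scores, labels) if y == 1),
                      reverse=True)
    if not included:
        raise ValueError("at least one relevant study is required")
    needed = max(1, math.ceil(target_recall * len(included)))
    return included[needed - 1]


def evaluate(scores: List[float], labels: List[int],
             threshold: float) -> Dict[str, float]:
    """Compute recall, exclusion rate and missed rate at a threshold."""
    kept = [s >= threshold for s in scores]
    n_included = sum(labels)
    missed = sum(1 for k, y in zip(kept, labels) if y == 1 and not k)
    excluded = kept.count(False)
    return {
        "recall": (n_included - missed) / n_included,
        "exclusion_rate": excluded / len(scores),
        "missed_rate": missed / n_included,
    }


# Compare a fixed 0.5 cut-off with a threshold tuned for full recall.
DEFAULT = evaluate(SCORES, LABELS, 0.5)
TUNED_THRESHOLD = pick_threshold(SCORES, LABELS, target_recall=1.0)
TUNED = evaluate(SCORES, LABELS, TUNED_THRESHOLD)
print("default:", DEFAULT)
print("tuned threshold:", TUNED_THRESHOLD, TUNED)
```

On this toy data, lowering the threshold trades a smaller exclusion rate for fewer missed studies, which is the trade-off the abstract reports on. In an update scenario the threshold would be chosen on the studies from the primary SLR and then applied to the new candidate studies.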
Area: COMP
Arrangement: urlib.net > BDMCI > Fonds > Produção anterior à 2021 > LABAC > Reducing efforts of...
doc Directory Content: access
source Directory Content: there are no files
agreement Directory Content:
agreement.html 02/10/2020 13:01 1.0 KiB
4. Conditions of access and use
Language: en
Target File: watanabe_reducing.pdf
User Group: simone
Reader Group: administrator
simone
Visibility: shown
Read Permission: deny from all and allow from 150.163
Update Permission: not transferred
5. Allied materials
Next Higher Units: 8JMKD3MGPCW/3ESGTTP
Citing Item List: sid.inpe.br/mtc-m21/2012/07.13.14.56.50 3
sid.inpe.br/bibdigital/2013/09.22.23.14 3
Host Collection: urlib.net/www/2017/11.22.19.04
6. Notes
Empty Fields: alternatejournal archivingpolicy archivist callnumber copyholder copyright creatorhistory descriptionlevel dissemination e-mailaddress format isbn label lineage mark mirrorrepository nextedition notes number orcid parameterlist parentrepositories previousedition previouslowerunit progress project rightsholder schedulinginformation secondarydate secondarykey session shorttitle sponsor subject tertiarymark tertiarytype url
7. Description control
e-Mail (login): simone