DSpace Repository

How different are different diff algorithms in Git?

dc.contributor.author Nugroho, Yusuf Sulistyo
dc.contributor.author Hata, Hideaki
dc.contributor.author Matsumoto, Kenichi
dc.date.accessioned 2020-09-18T10:31:06Z
dc.date.available 2020-09-18T10:31:06Z
dc.date.issued 2019-09-11
dc.identifier.uri http://hdl.handle.net/10061/14057
dc.description.abstract Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility, and users can choose among several diff algorithms, from the default Myers algorithm to the more advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. Regarding the impact on code churn metrics in 14 Java projects, the different diff algorithms produced different values in 1.7% to 8.2% of commits. Regarding bug-introducing change identification in 10 Java projects, we found that 6.0% and 13.3% of the identified bug-fix commits yielded different bug-introducing changes. For patch application, our manual analysis found that Histogram is more suitable than Myers for presenting code changes. Thus, we strongly recommend using the Histogram algorithm when mining Git repositories to analyze differences in source code. ja_JP
dc.language.iso en ja_JP
dc.publisher Springer Nature ja_JP
dc.relation.isreplacedby https://link.springer.com/article/10.1007%2Fs10664-019-09772-z ja_JP
dc.rights © The Author(s) 2019 ja_JP
dc.subject Code changes ja_JP
dc.subject Diff ja_JP
dc.subject Histogram algorithm ja_JP
dc.subject Mining repositories ja_JP
dc.title How different are different diff algorithms in Git? ja_JP
dc.type.nii Journal Article ja_JP
dc.contributor.transcription ハタ, ヒデアキ
dc.contributor.transcription マツモト, ケンイチ
dc.contributor.alternative 畑, 秀明
dc.contributor.alternative 松本, 健一
dc.textversion none ja_JP
dc.identifier.eissn 1573-7616
dc.identifier.jtitle Empirical Software Engineering ja_JP
dc.identifier.spage 790 ja_JP
dc.identifier.epage 823 ja_JP
dc.relation.doi 10.1007/s10664-019-09772-z ja_JP
dc.identifier.NAIST-ID 86630415 ja_JP
dc.identifier.NAIST-ID 73299364 ja_JP
dc.identifier.NAIST-ID 73292310 ja_JP
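
The abstract's comparison of code churn under different diff algorithms can be illustrated with a minimal Python sketch. It is not the authors' tooling. It assumes `git` is on the PATH and relies on Git's `--diff-algorithm` option with the values `myers` and `histogram`; the function names `numstat_command`, `parse_numstat`, and `churn_by_algorithm` are hypothetical and introduced only for illustration.

```python
import subprocess

# Two of the values accepted by git's --diff-algorithm option.
ALGORITHMS = ("myers", "histogram")


def numstat_command(commit, algorithm):
    """Build a `git show --numstat` command for one commit and one diff algorithm."""
    return [
        "git", "show", "--numstat", "--format=",
        f"--diff-algorithm={algorithm}", commit,
    ]


def parse_numstat(output):
    """Sum added and deleted lines from --numstat output.

    Each line has the form "<added>\t<deleted>\t<path>"; binary files
    report "-" for both counts and are skipped here.
    """
    added = deleted = 0
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or parts[0] == "-":
            continue
        added += int(parts[0])
        deleted += int(parts[1])
    return added, deleted


def churn_by_algorithm(commit, repo="."):
    """Return {algorithm: (added, deleted)} for one commit (hypothetical helper)."""
    result = {}
    for algorithm in ALGORITHMS:
        completed = subprocess.run(
            numstat_command(commit, algorithm),
            cwd=repo, capture_output=True, text=True, check=True,
        )
        result[algorithm] = parse_numstat(completed.stdout)
    return result
```

Calling `churn_by_algorithm("HEAD")` inside a Git working tree returns the added and deleted line counts per algorithm; a commit where the two tuples differ is one where the churn metric depends on the chosen algorithm.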
