MonthlyCommits is unsafe #96

@davisjam

Example: I cloned the popular libuv library and tried to extract monthly commits. I got only 16 commits, even though the project has had regular commits since 2011 (expected: roughly 6 years x 12 months/year = 72 commits).

Explanation: RepoDriller takes each commit's date from the author time rather than the committer time. Suppose someone forks libuv, adds commits, and years later submits a PR containing them. To take a real example, a commit authored in 2015 was merged in 2017. When MonthlyCommits encounters this commit, it accepts it as "at least X months before" the last accepted commit (which was from 2017). The 2015 commit then becomes the new reference point, so all of the remaining commits from 2017 and 2016 are skipped: none of them is at least X months older than it.
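The skipping behaviour can be reproduced without RepoDriller. The sketch below is a simplified model, not the actual MonthlyCommits code: `Commit`, `sample`, and `history` are hypothetical names, and the selection rule (walk newest-first, keep a commit when its date is at least N months before the last kept one) is my reading of the behaviour described above.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class MonthlySamplingSketch {
    // Hypothetical stand-in for a commit; this is not RepoDriller's ChangeSet.
    static final class Commit {
        final String id;
        final LocalDate authored;
        final LocalDate committed;

        Commit(String id, LocalDate authored, LocalDate committed) {
            this.id = id;
            this.authored = authored;
            this.committed = committed;
        }
    }

    // Assumed selection rule: walk history newest-first and keep a commit when
    // its date is at least `months` whole months before the last kept commit.
    static List<Commit> sample(List<Commit> newestFirst, int months,
                               Function<Commit, LocalDate> dateOf) {
        List<Commit> kept = new ArrayList<>();
        LocalDate last = null;
        for (Commit c : newestFirst) {
            LocalDate d = dateOf.apply(c);
            if (last == null || ChronoUnit.MONTHS.between(d, last) >= months) {
                kept.add(c);
                last = d; // the kept commit becomes the new reference point
            }
        }
        return kept;
    }

    // Made-up history: one commit per month in 2017, plus one commit that was
    // authored in 2015 but only merged in May 2017.
    static List<Commit> history() {
        List<Commit> h = new ArrayList<>();
        h.add(new Commit("c6", LocalDate.of(2017, 6, 1), LocalDate.of(2017, 6, 1)));
        h.add(new Commit("c5", LocalDate.of(2015, 1, 10), LocalDate.of(2017, 5, 1)));
        h.add(new Commit("c4", LocalDate.of(2017, 4, 1), LocalDate.of(2017, 4, 1)));
        h.add(new Commit("c3", LocalDate.of(2017, 3, 1), LocalDate.of(2017, 3, 1)));
        h.add(new Commit("c2", LocalDate.of(2017, 2, 1), LocalDate.of(2017, 2, 1)));
        h.add(new Commit("c1", LocalDate.of(2017, 1, 1), LocalDate.of(2017, 1, 1)));
        return h;
    }

    public static void main(String[] args) {
        List<Commit> h = history();
        System.out.println("author time:    " + sample(h, 1, c -> c.authored).size());
        System.out.println("committer time: " + sample(h, 1, c -> c.committed).size());
    }
}
```

On this made-up history, sampling by author time keeps only c6 and c5, because once the 2015-authored c5 is kept every later candidate is newer than the reference point. Sampling by committer time keeps all six.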

Suggested fix: Add both author and committer time fields to ChangeSet. Use the committer time in MonthlyCommits.

Initial exploration: Changing from getAuthorIdent to getCommitterIdent in GitRepository.convertToDate seems to do the right thing: 16 -> 77 commits found, and the timestamps are well-ordered.
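For anyone who wants to repeat the "well-ordered" check, here is a minimal sketch of what I mean by it, assuming the timestamps have already been extracted into a newest-first list. `TimestampOrderCheck` and `firstOutOfOrder` are hypothetical names, not part of RepoDriller, and the sample timestamps are made up.

```java
import java.time.Instant;
import java.util.List;

class TimestampOrderCheck {
    // Returns the index of the first timestamp that is newer than its
    // predecessor in a newest-first list, or -1 if the list never goes
    // forward in time.
    static int firstOutOfOrder(List<Instant> newestFirst) {
        for (int i = 1; i < newestFirst.size(); i++) {
            if (newestFirst.get(i).isAfter(newestFirst.get(i - 1))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Author times: the 2015 entry makes the following 2017 entry out of order.
        List<Instant> authorTimes = List.of(
                Instant.parse("2017-06-01T00:00:00Z"),
                Instant.parse("2015-01-10T00:00:00Z"),
                Instant.parse("2017-04-01T00:00:00Z"));
        // Committer times for the same three commits.
        List<Instant> committerTimes = List.of(
                Instant.parse("2017-06-01T00:00:00Z"),
                Instant.parse("2017-05-01T00:00:00Z"),
                Instant.parse("2017-04-01T00:00:00Z"));
        System.out.println(firstOutOfOrder(authorTimes));
        System.out.println(firstOutOfOrder(committerTimes));
    }
}
```

Committer times are not guaranteed to be monotonic in every repository, so a result of -1 is a sanity check on a given history rather than a proof that the sampling is correct.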

Thanks @ayaankazerouni for pointing out the difference between author and committer idents.
