From: John R Levine
Subject: Source code comparison tools
Date: 19 Jan 2005 22:50:04 -0500

I'm working on a project where I'm trying to understand how a program evolved. I have a bunch of snapshots of the C source code to work from.
It's easy enough to use the Unix diff program to compare the snapshots, but it's much too low level. If someone changes the name of variable A to B, diff sees that as a change every time the variable is referenced, but it's really only one change.
I can hack up some preprocessing scripts to abstract out variable names, indentation, and the like before handing the files to diff, but before I do so, I'd like to see if there are existing tools I can use.
Regards,
John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies",
Information Superhighwayman wanna-be, http://iecc.com/johnl, Mayor
"I dropped the toothpaste", said Tom, crestfallenly.
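The preprocessing idea described above might look roughly like the following Python sketch. It is only an illustration under several assumptions: the names `normalize` and `normalized_diff` are hypothetical, the tokenizer is a crude regular expression rather than a real C lexer, and preprocessor directives get no special handling. Comments and whitespace are dropped, output lines are broken after `;`, `{` and `}`, and non-keyword identifiers are replaced by placeholders numbered in order of first appearance within each output line.

```python
import difflib
import re

# C89 keywords; anything else that looks like an identifier is abstracted.
C_KEYWORDS = frozenset(
    "auto break case char const continue default do double else enum extern "
    "float for goto if int long register return short signed sizeof static "
    "struct switch typedef union unsigned void volatile while".split()
)

# Crude tokenizer (an assumption, not a full C lexer).
TOKEN_RE = re.compile(
    r"(?P<comment>/\*.*?\*/|//[^\n]*)"
    r'|(?P<string>"(?:\\.|[^"\\\n])*")'
    r"|(?P<char>'(?:\\.|[^'\\\n])*')"
    r"|(?P<ident>[A-Za-z_]\w*)"
    r"|(?P<number>\.?\d[\w.]*)"
    r"|(?P<space>\s+)"
    r"|(?P<punct>.)",
    re.DOTALL,
)


def normalize(source):
    """Return a list of layout-independent, name-independent lines."""
    lines, tokens, names = [], [], {}
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue  # indentation, line breaks and comments are ignored
        if kind == "ident" and text not in C_KEYWORDS:
            # Same name -> same placeholder, numbered by first appearance.
            text = names.setdefault(text, "ID%d" % (len(names) + 1))
        tokens.append(text)
        if text in (";", "{", "}"):
            lines.append(" ".join(tokens))
            tokens, names = [], {}  # restart numbering for the next line
    if tokens:
        lines.append(" ".join(tokens))
    return lines


def normalized_diff(old_source, new_source):
    """Unified diff of the normalized forms of two source texts."""
    return list(difflib.unified_diff(
        normalize(old_source), normalize(new_source),
        "old", "new", lineterm=""))


if __name__ == "__main__":
    old = "int f(int a) {\n  int total = a + 1; /* sum */\n  return total;\n}\n"
    new = "int f(int a)\n{\n    int sum = a + 2;\n    return sum;\n}\n"
    print("\n".join(normalized_diff(old, new)))
```

Restarting the numbering on every output line is a design choice: a consistent rename then leaves the normalized text unchanged, and adding a new variable in one statement does not shift the placeholders in every statement after it. The cost is that a rename vanishes from the diff entirely instead of being reported as one change.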

From: David Spencer
Subject: Re: Source code comparison tools
Date: 22 Jan 2005 18:26:27 -0500

"John R Levine" writes:
>I'm working on a project where I'm trying to understand how a program
>evolved. I have a bunch of snapshots of the C source code to work from.
>
>It's easy enough to use the Unix diff program to compare the snapshots,
>but it's much too low level. If someone changes the name of variable A to
>B, diff sees that as a change every time the variable is referenced, but
>it's really only one change.
Did you try feeding the various iterations to cflow?
It may filter out too much signal with the noise, but it would show changes in call flow.
-- dhs spencer@panix.com

From: Derek M Jones
Subject: Re: Source code comparison tools
Date: 22 Jan 2005 18:26:17 -0500

John,
> I'm working on a project where I'm trying to understand how a program
> evolved. I have a bunch of snapshots of the C source code to work from.
Sounds like an inverse plagiarism detection problem.
I have found Simian to be very good.
http://www.redhillconsulting.com.au/products/simian/
or you could modify the following open source project
http://www.catb.org/~esr/comparator/

From: Ira Baxter
Subject: Re: Source code comparison tools
Date: 22 Jan 2005 18:29:40 -0500

"John R Levine" wrote in message
> I'm working on a project where I'm trying to understand how a program
> evolved. I have a bunch of snapshots of the C source code to work from.
>
> It's easy enough to use the Unix diff program to compare the snapshots,
> but it's much too low level. If someone changes the name of variable A to
> B, diff sees that as a change every time the variable is referenced, but
> it's really only one change.
>
> I can hack up some preprocessing scripts to abstract out variable names,
> indentation, and the like before handing the files to diff, but before I
> do so, I'd like to see if there's existing tools I can use.
Hmm. Our CloneDR tool finds exact and near-miss duplicate code across programs. Clearly it would find a lot (!) of duplicate code across evolutionary versions. The duplicates it finds are effectively macro bodies; the parameters it proposes take into account replication of instances. In effect, this means that if somebody renames variables consistently in a block of code, the detected duplicate will have one parameter for each changed variable name, i.e., exactly your "only one change" effect.
See http://www.semanticdesigns.com/Products/Clone. There's a sample Java clone detector you can download there.
-- Ira D. Baxter, Ph.D., CTO 512-250-1018 Semantic Designs, Inc. www.semdesigns.com
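To make the "one parameter for each changed variable name" effect concrete, here is a small Python sketch. It is not CloneDR's algorithm, which is not described in this thread; it is a hypothetical token-level check (the name `consistent_renames` and both regular expressions are assumptions) that reports the renaming if two fragments are identical up to a consistent, one-to-one substitution of names. For simplicity, keywords are treated like any other name.

```python
import re

# Crude tokens: names, integer runs, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
NAME_RE = re.compile(r"[A-Za-z_]\w*")


def consistent_renames(old_code, new_code):
    """Return {old_name: new_name} for the names that changed, provided the
    fragments match token for token up to a one-to-one renaming.
    Return None if they differ in any other way."""
    old_tokens = TOKEN_RE.findall(old_code)
    new_tokens = TOKEN_RE.findall(new_code)
    if len(old_tokens) != len(new_tokens):
        return None
    forward, backward = {}, {}
    for old, new in zip(old_tokens, new_tokens):
        if NAME_RE.fullmatch(old) and NAME_RE.fullmatch(new):
            # Each old name must always map to the same new name, and
            # no two old names may collapse onto one new name.
            if forward.setdefault(old, new) != new:
                return None
            if backward.setdefault(new, old) != old:
                return None
        elif old != new:
            return None  # punctuation or literals differ
    return {old: new for old, new in forward.items() if old != new}


if __name__ == "__main__":
    before = "count = count + step; total += count;"
    after = "n = n + step; total += n;"
    print(consistent_renames(before, after))
```

Here the three references to `count` collapse into a single reported rename, which is the kind of summary a line-oriented diff cannot give.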

From: Saumya K. Debray
Subject: Re: Source code comparison tools
Date: 22 Jan 2005 18:29:58 -0500

Christian Collberg and Stephen Kobourov have done some interesting work on visualizing the evolution of software, based on information extracted from CVS:
C. Collberg, S. G. Kobourov, J. Nagra, J. Pitts, and K. Wampler, "A System for Graph-Based Visualization of the Evolution of Software," ACM Symp. on Software Visualization (SoftVis), pp. 77-86, 2003.
http://www.cs.arizona.edu/~kobourov/tgrip.pdf
John R Levine wrote:
>I'm working on a project where I'm trying to understand how a program
>evolved. I have a bunch of snapshots of the C source code to work from.
-- Saumya Debray Dept. of Computer Science, University of Arizona, Tucson debray@cs.arizona.edu http://www.cs.arizona.edu/~debray