joshuaclayton/unused


Unused

A command line tool to identify unused code.

Image of Unused Output

"What kinds of projects can I use it on?"

Anything.

Yes, literally anything.

It's probably best if you have a file generated from ctags it can read from (it looks in .git, tmp, and the root directory for a tags file), but if you have another way to pipe a bunch of methods/functions/classes/modules/whatever in, that works too.

Right now, there are some special cases built in for Rails and Phoenix apps (specifically, assumptions about what's fine to only have one reference to, e.g. Controllers in Rails and Views in Phoenix), but it'll work on Rubygems, Elixir packages, or anything else.

That said, be confident the code you're removing won't break your program. Especially with projects built in Ruby, Elixir, or JavaScript, there are ways to dynamically trigger or define behavior that may be surprising. A test suite can help here, but still cannot determine every possible execution path.

Installing and Updating

Homebrew (Recommended)

You can install my formulae via Homebrew with brew tap:

brew tap joshuaclayton/formulae

Next, run:

brew install unused

This will install unused and its corresponding dependencies.

To update, run:

brew update
brew upgrade unused

Alternatively, you can install with Stack or by hand. Because unused has to be compiled, installation time varies, but it often takes several minutes.

Stack

If you already have Stack installed, ensure you have the latest list of packages:

stack update

Verify Stack is using at least lts-6.0 when installing by checking the global project settings in ~/.stack/global-project/stack.yaml.
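If you'd rather script that check, here is a rough sketch. It assumes the global stack.yaml contains a line of the form `resolver: lts-N.M`; the function names are made up for illustration. A nightly or custom resolver is reported as "not verified" rather than guessed at.

```shell
# Print the major LTS number from a line such as "resolver: lts-6.35".
# Prints nothing when the file has no lts resolver line.
lts_major_version() {
  sed -n 's/^resolver:[[:space:]]*lts-\([0-9][0-9]*\)\..*$/\1/p' "$1" | head -n 1
}

# Succeed only when an lts resolver with major version 6 or higher is found.
resolver_is_recent_enough() (
  major="$(lts_major_version "$1")"
  [ -n "$major" ] && [ "$major" -ge 6 ]
)
```

For example, `resolver_is_recent_enough ~/.stack/global-project/stack.yaml || echo "check your resolver"` prints the reminder when the resolver is older than lts-6 or could not be determined.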

Once that is complete, run:

stack install unused

This will install unused in the appropriate directory for Stack; you'll want to ensure your $PATH reflects this.

Installing by hand

This project is written in Haskell and uses Stack.

Once you have these tools installed and the project cloned locally:

stack setup
stack install

This will generate a binary in $HOME/.local/bin; ensure this directory is in your $PATH.
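To check whether a directory is already in your $PATH, a small helper like the following works in any POSIX shell. `path_contains` is a name I picked for this sketch; it is not part of unused or Stack.

```shell
# Succeed when the given directory is one of the entries in $PATH.
# Wrapping both sides in colons avoids matching a prefix such as
# /opt/demo when only /opt/demo/bin is present.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

You could then add `path_contains "$HOME/.local/bin" || export PATH="$HOME/.local/bin:$PATH"` to your shell profile.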

Install via Docker

Once Docker is installed, create a binary within your $PATH to run the image:

#!/usr/bin/env bash

# Quote the mount path and arguments so spaces are preserved
docker run --rm -it -v "$(pwd)":/code joshuaclayton/unused unused "$@"

Note that, because Unused will be running inside a virtual machine, it will take longer to generate output than it would if installed via the previously mentioned methods.

Using Unused

unused attempts to read from common tags file locations (.git/tags, tags, and tmp/tags).

In an application where the tags file exists, run:

unused

If you don't have a tags file, you can generate one by running:

git ls-files | xargs ctags
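If you're not sure whether a tags file already exists in one of those locations, a helper along these lines can tell you. This is only a sketch: `find_tags_file` is a hypothetical name, and the order in which the three paths are checked is my assumption, not necessarily the order unused uses.

```shell
# Print the first tags file found among the documented locations,
# relative to the given project root (default: current directory).
# Exits non-zero when none of them exists.
find_tags_file() (
  root="${1:-.}"
  for candidate in .git/tags tags tmp/tags; do
    if [ -f "$root/$candidate" ]; then
      printf '%s\n' "$root/$candidate"
      exit 0
    fi
  done
  exit 1
)
```

Running `find_tags_file || echo "no tags file yet"` from the project root shows which file was found, or reminds you to generate one.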

If you want to specify a custom tags file, or load tokens from somewhere else, run:

cat .custom/tags | unused --stdin

To view more usage options, run:

unused --help

Troubleshooting

Ctags (and a corresponding workflow) isn't configured

Exuberant Ctags (or another tool that will generate a tags file, like hasktags for Haskell projects) is required to use unused correctly; however, the version of ctags that ships with OS X (/usr/bin/ctags) is an older version that won't work with many languages (that BSD version of ctags says it "makes a tags file for ex(1) from the specified C, Pascal, Fortran, YACC, lex, and lisp sources.").

Installation via Homebrew includes the ctags dependency. You can also run brew install ctags by hand. If you're not on OS X, use your favorite package manager and refer to the Exuberant Ctags site for download instructions.

Ctags manual run

If you're using ctags to generate a tags file prior to running unused and don't have a workflow around automatically generating a tags file, run:

git ls-files | xargs ctags -f tmp/tags

This will take your .gitignore into account and write the tags file to tmp/tags. Be sure to write this to a location that's ignored by git.

While this process allows a developer to get started, it requires remembering to run this command before running unused. Let's explore how to automate this process.

Ctags automatic runs via git hooks

With ctags installed, you'll likely want to configure your workflow such that your tags file gets updated periodically without any action on your part. I recommend following the instructions outlined by Tim Pope on this matter, which describe a workflow coupled to git for managing the tags file. It includes shell scripting that may not look "effortless"; however, the fact that this is automated helps to ensure unused is running against new versions of the code as you (and other teammates, if you have any) are committing.

As he suggests, you'll want to run git init in each directory where you want this hook, and then run the hook manually:

git ctags

unused is configured to look for a tags file in three different directories, including .git/ as the article suggests, so no further configuration will be necessary with unused.

"Calculating cache fingerprint" takes a long time

unused attempts to be intelligent at understanding if your codebase has changed before running analysis (since it can be time-consuming on large codebases). To do so, it calculates a "fingerprint" of the entire directory by using md5 (or md5sum), along with find and your .gitignore file.

If you're checking in artifacts (e.g. node_modules/, dist/, tmp/, or similar), unused will likely take significantly longer to calculate the fingerprint.
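To get a feel for why extra files make this slow, here is a simplified sketch of a directory fingerprint. It is not unused's actual implementation: the function names are hypothetical, it skips only .git/ rather than consulting .gitignore, and it doesn't handle file names containing newlines. Every file has to be read and hashed, so checked-in artifacts add directly to the running time.

```shell
# Hash stdin with whichever MD5 tool is installed
# (md5sum on most Linux systems, md5 on macOS/BSD).
md5_of_stdin() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum | cut -d ' ' -f 1
  else
    md5 -q
  fi
}

# Print one hash summarizing every file under a directory (skipping .git/).
# Each file is hashed individually, then the sorted list of
# "hash  path" lines is hashed once more.
directory_fingerprint() (
  cd "${1:-.}" || exit 1
  find . -type f ! -path './.git/*' | LC_ALL=C sort |
    while IFS= read -r file; do
      printf '%s  %s\n' "$(md5_of_stdin < "$file")" "$file"
    done | md5_of_stdin
)
```

Two runs over an unchanged directory print the same value; editing, adding, or removing any file changes it.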

Per the --help documentation, you can disable caching with the -C flag:

unused -C

"No results found" when expecting results

If you're expecting to see results but unused doesn't find anything, verify that any artifacts unused uses (e.g. the tags file, wherever it's located) or generates (e.g. in PROJECT_ROOT/tmp/unused) are .gitignored.

What might be happening is, because unused searches for tokens with ag (which honors .gitignore), it's running into checked-in versions of the tokens from other files, resulting in duplicate occurrences that aren't representative of the actual codebase. The most obvious might be the tags file itself, although if you're using an IDE that runs any sort of analysis and that's getting checked in somehow, that may cause it too.
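One way to check this is with `git check-ignore`, which exits successfully when a path is ignored. The sketch below lists any of the given paths that exist but are not ignored; `unignored_artifacts` is a hypothetical helper name, and it should be run from inside the repository.

```shell
# Print each given path that exists but is NOT ignored by git.
# No output means every existing path is ignored.
unignored_artifacts() (
  for artifact in "$@"; do
    if [ -e "$artifact" ] && ! git check-ignore -q "$artifact"; then
      printf '%s\n' "$artifact"
    fi
  done
)
```

For example, `unignored_artifacts tags tmp/tags tmp/unused` prints nothing when everything is set up correctly. Note that git reports a file that's already tracked as "not ignored", even if a .gitignore pattern matches it.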

One final piece to check is the number of tokens in the tags file itself; if ctags is misconfigured and only a handful of tokens are being analyzed, they all may have low removal likelihood and not display in the default results (high-likelihood only).

Analysis takes a long time due to a large number of terms found

In my experience, projects under 100,000 LOC should have at most around 8,000 unique tokens found. This obviously depends on how you structure your classes/modules/functions, but it'll likely be close.

If you're seeing more than 15,000 terms matched (I've seen upwards of 70,000), this is very likely due to misconfiguration of ctags where it includes some amount of build artifacts. In Ruby, this might be a RAILS_ROOT/vendor directory, or if you're using NPM, APP_ROOT/node_modules or APP_ROOT/bower_components.
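To see how many unique tokens your tags file holds before running unused, something like this gives a quick estimate. It's a sketch that assumes the usual ctags layout (tab-separated fields with the token name first, and header lines starting with `!_TAG_`); the function names are mine, and unused may count terms somewhat differently.

```shell
# Count unique token names in a ctags file, skipping "!_TAG_" header lines.
count_unique_tokens() {
  grep -v '^!_TAG_' "$1" | cut -f 1 | LC_ALL=C sort -u | wc -l | tr -d ' '
}

# Report the count and flag anything above the rough 15,000 mark.
tags_size_report() (
  count="$(count_unique_tokens "$1")"
  if [ "$count" -gt 15000 ]; then
    echo "$count tokens: ctags is probably indexing build artifacts"
  else
    echo "$count tokens"
  fi
)
```

For example, `tags_size_report tmp/tags` prints the number of unique tokens, with a warning when it looks suspiciously high.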

When configuring ctags, be sure to include your --exclude directives; you can find an example here.
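If you generate tags from `git ls-files`, another option is to filter the file list before it reaches ctags. The following is a minimal sketch; the function name is hypothetical and the directory list is only illustrative, so adjust it for your project.

```shell
# Read paths on stdin and drop those under directories that usually
# hold third-party or generated code.
without_build_artifacts() {
  grep -v -E '^(vendor|node_modules|bower_components)/'
}
```

It slots into the earlier command as `git ls-files | without_build_artifacts | xargs ctags -f tmp/tags`.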

Custom Configuration

The first time you use unused, you might see a handful of false positives. unused will look in two additional locations in an attempt to load additional custom configuration to help improve this.

Configuration format

# Language or framework name
#   e.g. Rails, Ruby, Go, Play
- name: Framework or language
  # Collection of matches allowed to have one occurrence
  autoLowLikelihood:
    # Low likelihood match name
    - name: ActiveModel::Serializer
      # Flag to capture only capitalized names
      #   e.g. would match `ApplicationController`, not `with_comments`
      classOrModule: true

      # Matcher for `.*Serializer$`
      #   e.g. `UserSerializer`, `ProjectSerializer`
      termEndsWith: Serializer

      # Matcher for `^with_.*`
      #   e.g. `with_comments`, `with_previous_payments`
      termStartsWith: with_

      # Matcher for `^ApplicationController$`
      termEquals: ApplicationController

      # Matcher for `.*_factory.ex`
      #   e.g. `lib/appname/user_factory.ex`, `lib/appname/project_factory.ex`
      pathEndsWith: _factory.ex

      # Matcher for `^app/policies.*`
      #   e.g. `app/policies/user_policy.rb`, `app/policies/project_policy.rb`
      pathStartsWith: app/policies

      # list of termEquals
      # Matcher allowing any exact match from a list
      allowedTerms:
      - index?
      - edit?
      - create?

~/.unused.yml

The first location is ~/.unused.yml. This should hold widely-used configuration roughly applicable across projects. Here's an example of what might be present:

- name: Rails
  autoLowLikelihood:
    - name: ActiveModel::Serializer
      termEndsWith: Serializer
      classOrModule: true
    - name: Pundit
      termEndsWith: Policy
      classOrModule: true
      pathEndsWith: .rb
    - name: Pundit Helpers
      allowedTerms:
        - Scope
        - index?
        - new?
        - create?
        - show?
        - edit?
        - destroy?
        - resolve
    - name: JSONAPI::Resources
      termEndsWith: Resource
      classOrModule: true
      pathStartsWith: app/resources
    - name: JSONAPI::Resources Helpers
      allowedTerms:
      - updatable_fields
      pathStartsWith: app/resources

I tend to work on different APIs, and the two libraries I most commonly use have a fairly similar pattern when it comes to class naming. They both also use that naming structure to identify serializers automatically, meaning they very well may only be referenced once in the entire application (when they're initially defined).

Similarly, with Pundit, an authorization library, naming conventions often mean only one reference to the class name.

This is a file that might grow, but is focused on widely-used patterns across codebases. You might even want to check it into your dotfiles.

APP_ROOT/.unused.yml

The second location is APP_ROOT/.unused.yml. This is where any project-specific settings might live. If you're working on a library before extracting to a gem or package, you might have this configuration take that into account.
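As a starting point, this sketch writes a small project-level file using only keys from the configuration format above. The "Decorators" entry and the app/decorators path are hypothetical; swap in whatever conventions your project relies on. Be aware that it overwrites any existing .unused.yml in the target directory.

```shell
# Write a starter project-level config into the given directory
# (default: current directory). Overwrites an existing .unused.yml.
write_sample_project_config() {
  cat > "${1:-.}/.unused.yml" <<'YAML'
- name: Project conventions
  autoLowLikelihood:
    - name: Decorators
      termEndsWith: Decorator
      classOrModule: true
      pathStartsWith: app/decorators
YAML
}
```

Run `write_sample_project_config` from APP_ROOT, then edit the generated file to match your own naming patterns.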

Validation

unused will attempt to parse both of these files, if it finds them. If either is invalid due to missing or mistyped keys, an error will be displayed.

Requirements

Unused leverages Ag to analyze the codebase; as such, you'll need to have ag available in your $PATH. This is set as an explicit dependency in Homebrew.

Alternatively, if you'd like to use RipGrep, you can do so with the --search rg flag. Be sure to have RipGrep installed first.
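If you share scripts across machines that may have either tool, a helper like this can choose the flag for you. It's only a sketch: `unused_search_flags` is a hypothetical name, and it assumes the executables are called `ag` and `rg`.

```shell
# Print the extra flags unused needs for the search tool that is installed.
# ag is the default, so nothing is printed when it is available.
unused_search_flags() {
  if command -v ag >/dev/null 2>&1; then
    return 0
  elif command -v rg >/dev/null 2>&1; then
    printf '%s\n' '--search rg'
  else
    echo 'Neither ag nor rg was found in $PATH' >&2
    return 1
  fi
}
```

Call it as `unused $(unused_search_flags)`, leaving the substitution unquoted so that `--search` and `rg` are passed as separate arguments.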

Testing

To run the test suite, run:

stack test

License

Copyright 2016-2017 Josh Clayton. See the LICENSE.