Compare commits

1 commit

Author: Geoff Greer
SHA1: f90bf93036
Message: Start of pcre2 stuff. Doesn't even compile yet.
Date: 2016-12-03 12:44:20 -08:00
41 changed files with 298 additions and 1400 deletions

1
.gitignore vendored
View file

@ -1,5 +1,4 @@
*.dSYM
*.gcda
*.o
*.plist
.deps

View file

@ -1,14 +1,9 @@
language: c
dist: xenial
sudo: false
branches:
only:
- master
- ppc64le
arch:
- amd64
- ppc64le
compiler:
- clang
@ -27,12 +22,12 @@ addons:
env:
global:
- LLVM_VERSION=6.0.1
- LLVM_VERSION=3.8.0
- LLVM_PATH=$HOME/clang+llvm
- CLANG_FORMAT=$LLVM_PATH/bin/clang-format
before_install:
- wget http://llvm.org/releases/$LLVM_VERSION/clang+llvm-$LLVM_VERSION-x86_64-linux-gnu-ubuntu-16.04.tar.xz -O $LLVM_PATH.tar.xz
- wget http://llvm.org/releases/$LLVM_VERSION/clang+llvm-$LLVM_VERSION-x86_64-linux-gnu-ubuntu-14.04.tar.xz -O $LLVM_PATH.tar.xz
- mkdir $LLVM_PATH
- tar xf $LLVM_PATH.tar.xz -C $LLVM_PATH --strip-components=1
- export PATH=$HOME/.local/bin:$PATH
@ -42,9 +37,3 @@ install:
script:
- ./build.sh && make test
notifications:
irc: 'chat.freenode.net#ag'
on_success: change
on_failure: always
use_notice: true

View file

@ -14,13 +14,3 @@ The test suite uses [Cram](https://bitheap.org/cram/). You'll need to build ag
first, and then you can run the suite from the root of the repository :
make test
### Adding filetypes
Ag can search files which belong to a certain class for example `ag --html test`
searches all files with the extension defined in [lang.c](src/lang.c).
If you want to add a new file 'class' to ag please modify [lang.c](src/lang.c) and [list_file_types.t](tests/list_file_types.t).
`lang.c` adds the functionality and `list_file_types.t` adds the test case.
Without adding a test case the test __will__ fail.

View file

@ -1,8 +1,8 @@
ACLOCAL_AMFLAGS = ${ACLOCAL_FLAGS}
bin_PROGRAMS = ag
ag_SOURCES = src/ignore.c src/ignore.h src/log.c src/log.h src/options.c src/options.h src/print.c src/print_w32.c src/print.h src/scandir.c src/scandir.h src/search.c src/search.h src/lang.c src/lang.h src/util.c src/util.h src/decompress.c src/decompress.h src/uthash.h src/main.c src/zfile.c
ag_LDADD = ${PCRE_LIBS} ${LZMA_LIBS} ${ZLIB_LIBS} $(PTHREAD_LIBS)
ag_SOURCES = src/ignore.c src/ignore.h src/log.c src/log.h src/options.c src/options.h src/print.c src/print_w32.c src/print.h src/scandir.c src/scandir.h src/search.c src/search.h src/lang.c src/lang.h src/util.c src/util.h src/decompress.c src/decompress.h src/uthash.h src/main.c
ag_LDADD = ${PCRE2_LIBS} ${LZMA_LIBS} ${ZLIB_LIBS} $(PTHREAD_LIBS)
dist_man_MANS = doc/ag.1
@ -13,9 +13,6 @@ dist_zshcomp_DATA = _the_silver_searcher
EXTRA_DIST = Makefile.w32 LICENSE NOTICE the_silver_searcher.spec README.md
all:
@$(MAKE) ag -r
test: ag
cram -v tests/*.t
if HAS_CLANG_FORMAT
@ -30,4 +27,4 @@ test_big: ag
test_fail: ag
cram -v tests/fail/*.t
.PHONY : all clean test test_big test_fail
.PHONY : all test clean

View file

@ -6,7 +6,7 @@ A code searching tool similar to `ack`, with a focus on speed.
[![Floobits Status](https://floobits.com/ggreer/ag.svg)](https://floobits.com/ggreer/ag/redirect)
[![#ag on Freenode](https://img.shields.io/badge/Freenode-%23ag-brightgreen.svg)](https://webchat.freenode.net/?channels=ag)
[![#ag on Freenode](http://img.shields.io/Freenode/%23ag.png)](https://webchat.freenode.net/?channels=ag)
Do you know C? Want to improve ag? [I invite you to pair with me](http://geoff.greer.fm/2014/10/13/help-me-get-to-ag-10/).
@ -34,7 +34,7 @@ There are also [graphs of performance across releases](http://geoff.greer.fm/ag/
* Files are `mmap()`ed instead of read into a buffer.
* Literal string searching uses [Boyer-Moore strstr](https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm).
* Regex searching uses [PCRE's JIT compiler](http://sljit.sourceforge.net/pcre.html) (if Ag is built with PCRE >=8.21).
* Ag calls `pcre_study()` before executing the same regex on every file.
* Ag calls `pcre2_study()` before executing the same regex on every file.
* Instead of calling `fnmatch()` on every pattern in your ignore files, non-regex patterns are loaded into arrays and binary searched.
I've written several blog posts showing how I've improved performance. These include how I [added pthreads](http://geoff.greer.fm/2012/09/07/the-silver-searcher-adding-pthreads/), [wrote my own `scandir()`](http://geoff.greer.fm/2012/09/03/profiling-ag-writing-my-own-scandir/), [benchmarked every revision to find performance regressions](http://geoff.greer.fm/2012/08/25/the-silver-searcher-benchmarking-revisions/), and profiled with [gprof](http://geoff.greer.fm/2012/02/08/profiling-with-gprof/) and [Valgrind](http://geoff.greer.fm/2012/01/23/making-programs-faster-profiling/).
@ -42,7 +42,7 @@ I've written several blog posts showing how I've improved performance. These inc
## Installing
### macOS
### OS X
brew install the_silver_searcher
@ -67,7 +67,7 @@ or
yum install epel-release.noarch the_silver_searcher
* Gentoo
emerge -a sys-apps/the_silver_searcher
emerge the_silver_searcher
* Arch
pacman -S the_silver_searcher
@ -76,20 +76,6 @@ or
sbopkg -i the_silver_searcher
* openSUSE
zypper install the_silver_searcher
* CentOS
yum install the_silver_searcher
* NixOS/Nix/Nixpkgs
nix-env -iA silver-searcher
* SUSE Linux Enterprise: Follow [these simple instructions](https://software.opensuse.org/download.html?project=utilities&package=the_silver_searcher).
### BSD
@ -100,67 +86,37 @@ or
pkg_add the_silver_searcher
### Windows
### Cygwin
* Win32/64
Unofficial daily builds are [available](https://github.com/k-takata/the_silver_searcher-win32).
* winget
winget install "The Silver Searcher"
Notes:
- This installs a [release](https://github.com/JFLarvoire/the_silver_searcher/releases) of ag.exe optimized for Windows.
- winget is intended to become the default package manager client for Windows.
As of June 2020, it's still in beta, and can be installed using instructions [there](https://github.com/microsoft/winget-cli).
- The setup script in the Ag's winget package installs ag.exe in the first directory that matches one of these criteria:
1. Over a previous instance of ag.exe *from the same [origin](https://github.com/JFLarvoire/the_silver_searcher)* found in the PATH
2. In the directory defined in environment variable bindir_%PROCESSOR_ARCHITECTURE%
3. In the directory defined in environment variable bindir
4. In the directory defined in environment variable windir
* Chocolatey
choco install ag
* MSYS2
pacman -S mingw-w64-{i686,x86_64}-ag
* Cygwin
Run the relevant [`setup-*.exe`](https://cygwin.com/install.html), and select "the\_silver\_searcher" in the "Utils" category.
Run the relevant [`setup-*.exe`](https://cygwin.com/install.html), and select "the\_silver\_searcher" in the "Utils" category.
## Building from source
### Building master
1. Install dependencies (Automake, pkg-config, PCRE, LZMA):
* macOS:
1. Install dependencies (Automake, pkg-config, PCRE2, LZMA):
* OS X:
brew install automake pkg-config pcre xz
brew install automake pkg-config pcre2 xz
or
port install automake pkgconfig pcre xz
port install automake pkgconfig pcre2 xz
* Ubuntu/Debian:
apt-get install -y automake pkg-config libpcre3-dev zlib1g-dev liblzma-dev
apt-get install -y automake pkg-config libpcre2-dev zlib1g-dev liblzma-dev
* Fedora:
yum -y install pkgconfig automake gcc zlib-devel pcre-devel xz-devel
yum -y install pkgconfig automake gcc zlib-devel pcre2-devel xz-devel
* CentOS:
yum -y groupinstall "Development Tools"
yum -y install pcre-devel xz-devel zlib-devel
* openSUSE:
zypper source-install --build-deps-only the_silver_searcher
yum -y install pcre2-devel xz-devel
* Windows: It's complicated. See [this wiki page](https://github.com/ggreer/the_silver_searcher/wiki/Windows).
2. Run the build script (which just runs aclocal, automake, etc):
./build.sh
On Windows (inside an msys/MinGW shell):
On Windows (inside an msys/MinGW shell):
make -f Makefile.w32
3. Make install:
@ -185,7 +141,7 @@ You may need to use `sudo` or run as root for the make install.
### Vim
You can use Ag with [ack.vim](https://github.com/mileszs/ack.vim) by adding the following line to your `.vimrc`:
You can use Ag with [ack.vim][] by adding the following line to your `.vimrc`:
let g:ackprg = 'ag --nogroup --nocolor --column'
@ -208,10 +164,9 @@ TextMate users can use Ag with [my fork](https://github.com/ggreer/AckMate) of t
## Other stuff you might like
* [Ack](https://github.com/petdance/ack3) - Better than grep. Without Ack, Ag would not exist.
* [Ack](https://github.com/petdance/ack2) - Better than grep. Without Ack, Ag would not exist.
* [ack.vim](https://github.com/mileszs/ack.vim)
* [Exuberant Ctags](http://ctags.sourceforge.net/) - Faster than Ag, but it builds an index beforehand. Good for *really* big codebases.
* [Git-grep](http://git-scm.com/docs/git-grep) - As fast as Ag but only works on git repos.
* [fzf](https://github.com/junegunn/fzf) - A command-line fuzzy finder
* [ripgrep](https://github.com/BurntSushi/ripgrep)
* [Sack](https://github.com/sampson-chen/sack) - A utility that wraps Ack and Ag. It removes a lot of repetition from searching and opening matching files.
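
The README's point that non-regex ignore patterns are loaded into arrays and binary searched, rather than passed to `fnmatch()` one by one, can be illustrated with a short sketch. This is not ag's actual code: `compare_names` and `name_is_ignored` are hypothetical names, and the sketch assumes the array is already sorted in `strcmp()` order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* bsearch() comparator: key is the filename (const char *),
 * elem points at one slot of the sorted array (const char *const *). */
static int compare_names(const void *key, const void *elem) {
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Hypothetical helper: returns 1 if filename appears in the sorted
 * array of literal ignore names, 0 otherwise. */
static int name_is_ignored(const char *const *names, size_t names_len, const char *filename) {
    assert(filename != NULL);
    if (names_len == 0) {
        return 0;
    }
    return bsearch(filename, names, names_len, sizeof(names[0]), compare_names) != NULL;
}
```

With a sorted array such as `{ "Makefile.in", "build", "node_modules" }`, a lookup costs O(log n) string comparisons per filename instead of one `fnmatch()` call per pattern.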

View file

@ -67,7 +67,7 @@ _ag() {
--parallel
--passthrough
--passthru
--path-to-ignore
--path-to-agignore
--print-long-lines
--print0
--recurse
@ -106,7 +106,7 @@ _ag() {
--ignore-dir) # directory completion
_filedir -d
return 0;;
--path-to-ignore) # file completion
--path-to-agignore) # file completion
_filedir
return 0;;
--pager) # command completion

View file

@ -1,21 +0,0 @@
#!/bin/sh
set -e
cd "$(dirname "$0")"
AC_SEARCH_OPTS=""
# For those of us with pkg-config and other tools in /usr/local
PATH=$PATH:/usr/local/bin
# This is to make life easier for people who installed pkg-config in /usr/local
# but have autoconf/make/etc in /usr/. AKA most mac users
if [ -d "/usr/local/share/aclocal" ]
then
AC_SEARCH_OPTS="-I /usr/local/share/aclocal"
fi
# shellcheck disable=2086
aclocal $AC_SEARCH_OPTS
autoconf
autoheader
automake --add-missing

View file

@ -1,8 +1,22 @@
#!/bin/sh
set -e
cd "$(dirname "$0")"
cd "$(dirname "$0")" || exit 1
./autogen.sh
./configure "$@"
AC_SEARCH_OPTS=""
# For those of us with pkg-config and other tools in /usr/local
PATH=$PATH:/usr/local/bin
# This is to make life easier for people who installed pkg-config in /usr/local
# but have autoconf/make/etc in /usr/. AKA most mac users
if [ -d "/usr/local/share/aclocal" ]
then
AC_SEARCH_OPTS="-I /usr/local/share/aclocal"
fi
# shellcheck disable=2086
aclocal $AC_SEARCH_OPTS && \
autoconf && \
autoheader && \
automake --add-missing && \
./configure "$@" && \
make -j4

View file

@ -1,6 +1,6 @@
AC_INIT(
[the_silver_searcher],
[2.2.0],
[1.0.1],
[https://github.com/ggreer/the_silver_searcher/issues],
[the_silver_searcher],
[https://github.com/ggreer/the_silver_searcher])
@ -10,13 +10,13 @@ AM_INIT_AUTOMAKE([no-define foreign subdir-objects])
AC_PROG_CC
AM_PROG_CC_C_O
AC_PREREQ([2.59])
AC_PROG_GREP
m4_ifdef(
[AM_SILENT_RULES],
[AM_SILENT_RULES([yes])])
PKG_CHECK_MODULES([PCRE], [libpcre])
PKG_CHECK_MODULES([PCRE2], [libpcre2-8])
AC_DEFINE([PCRE2_CODE_UNIT_WIDTH], [8], [Use utf8])
m4_include([m4/ax_pthread.m4])
AX_PTHREAD(
@ -25,12 +25,7 @@ AX_PTHREAD(
)
# Run CFLAGS="-pg" ./configure if you want debug symbols
if ! echo "$CFLAGS" | "$GREP" '\(^\|[[[:space:]]]\)-O' > /dev/null; then
CFLAGS="$CFLAGS -O2"
fi
CFLAGS="$CFLAGS $PTHREAD_CFLAGS $PCRE_CFLAGS -Wall -Wextra -Wformat=2 -Wno-format-nonliteral -Wshadow"
CFLAGS="$CFLAGS -Wpointer-arith -Wcast-qual -Wmissing-prototypes -Wno-missing-braces -std=gnu89 -D_GNU_SOURCE"
CFLAGS="$CFLAGS $PTHREAD_CFLAGS $PCRE2_CFLAGS -Wall -Wextra -Wformat=2 -Wno-format-nonliteral -Wshadow -Wpointer-arith -Wcast-qual -Wmissing-prototypes -Wno-missing-braces -std=gnu89 -D_GNU_SOURCE -O2"
LDFLAGS="$LDFLAGS"
case $host in
@ -56,15 +51,14 @@ AS_IF([test "x$enable_lzma" != "xno"], [
PKG_CHECK_MODULES([LZMA], [liblzma])
])
AC_CHECK_DECL([PCRE_CONFIG_JIT], [AC_DEFINE([USE_PCRE_JIT], [], [Use PCRE JIT])], [], [#include <pcre.h>])
AC_CHECK_DECL([PCRE2_CONFIG_JIT], [AC_DEFINE([USE_PCRE2_JIT], [], [Use PCRE2 JIT])], [], [#include <pcre2.h>])
AC_CHECK_DECL([CPU_ZERO, CPU_SET], [AC_DEFINE([USE_CPU_SET], [], [Use CPU_SET macros])] , [], [#include <sched.h>])
AC_CHECK_HEADERS([sys/cpuset.h err.h])
AC_CHECK_MEMBER([struct dirent.d_type], [AC_DEFINE([HAVE_DIRENT_DTYPE], [], [Have dirent struct member d_type])], [], [[#include <dirent.h>]])
AC_CHECK_MEMBER([struct dirent.d_namlen], [AC_DEFINE([HAVE_DIRENT_DNAMLEN], [], [Have dirent struct member d_namlen])], [], [[#include <dirent.h>]])
AC_CHECK_FUNCS(fgetln fopencookie getline realpath strlcpy strndup vasprintf madvise posix_fadvise pthread_setaffinity_np pledge)
AC_CHECK_FUNCS(fgetln getline realpath strlcpy strndup vasprintf madvise posix_fadvise pthread_setaffinity_np pledge)
AC_CONFIG_FILES([Makefile the_silver_searcher.spec])
AC_CONFIG_HEADERS([src/config.h])

View file

@ -1,7 +1,7 @@
.\" generated with Ronn/v0.7.3
.\" http://github.com/rtomayko/ronn/tree/0.7.3
.
.TH "AG" "1" "December 2016" "" ""
.TH "AG" "1" "November 2016" "" ""
.
.SH "NAME"
\fBag\fR \- The Silver Searcher\. Like ack, but faster\.
@ -136,7 +136,7 @@ Skip the rest of a file after NUM matches\. Default is 0, which never skips\.
.
.TP
\fB\-\-[no]mmap\fR
Toggle use of memory\-mapped I/O\. Defaults to true on platforms where \fBmmap()\fR is faster than \fBread()\fR\. (All but macOS\.)
Toggle use of memory\-mapped I/O\. Defaults to true\.
.
.TP
\fB\-\-[no]multiline\fR

View file

@ -109,8 +109,7 @@ Recursively search for PATTERN in PATH. Like grep or ack, but faster.
Skip the rest of a file after NUM matches. Default is 0, which never skips.
* `--[no]mmap`:
Toggle use of memory-mapped I/O. Defaults to true on platforms where
`mmap()` is faster than `read()`. (All but macOS.)
Toggle use of memory-mapped I/O. Defaults to true.
* `--[no]multiline`:
Match regexes across newlines. Enabled by default.
@ -207,9 +206,6 @@ Recursively search for PATTERN in PATH. Like grep or ack, but faster.
* `--workers NUM`:
Use NUM worker threads. Default is the number of CPU cores, with a max of 8.
* `-W --width NUM`:
Truncate match lines after NUM characters.
* `-z --search-zip`:
Search contents of compressed files. Currently, gz and xz are supported.
This option requires that ag is built with lzma and zlib.

10
pgo.sh
View file

@ -1,10 +0,0 @@
#!/bin/sh
set -e
cd "$(dirname "$0")"
make clean
./build.sh CFLAGS="$CFLAGS -fprofile-generate"
./ag example ..
make clean
./build.sh CFLAGS="$CFLAGS -fprofile-correction -fprofile-use"

View file

@ -1,196 +0,0 @@
#!/bin/bash
# Copyright 2016 Allen Wild
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
AVAILABLE_SANITIZERS=(
address
thread
undefined
valgrind
)
DEFAULT_SANITIZERS=(
address
thread
undefined
)
usage() {
cat <<EOF
Usage: $0 [-h] [valgrind | [SANITIZERS ...]]
This script recompiles ag using -fsanitize=<SANITIZER> and then runs the test suite.
Memory leaks or other errors will be printed in ag's output, thus failing the test.
Available LLVM sanitizers are: ${AVAILABLE_SANITIZERS[*]}
The compile-time sanitizers are supported in clang/llvm >= 3.1 and gcc >= 4.8
for x86_64 Linux only. clang is preferred and will be used, if available.
For function names and line numbers in error output traces, llvm-symbolizer needs
to be available in PATH or set through ASAN_SYMBOLIZER_PATH.
If 'valgrind' is passed as the sanitizer, then ag will be run through valgrind
without recompiling. If $(dirname $0)/ag doesn't exist, then it will be built.
WARNING: This script will run "make distclean" and "./configure" to recompile ag
once per sanitizer (except for valgrind). If you need to pass additional
options to ./configure, put them in the CONFIGOPTS environment variable.
EOF
}
vrun() {
echo "Running: $*"
"$@"
}
die() {
echo "Fatal: $*"
exit 1
}
valid_sanitizer() {
for san in "${AVAILABLE_SANITIZERS[@]}"; do
if [[ "$1" == "$san" ]]; then
return 0
fi
done
return 1
}
run_sanitizer() {
sanitizer=$1
if [[ "$sanitizer" == "valgrind" ]]; then
run_valgrind
return $?
fi
echo -e "\nCompiling for sanitizer '$sanitizer'"
[[ -f Makefile ]] && vrun make distclean
vrun ./configure $CONFIGOPTS CC=$SANITIZE_CC \
CFLAGS="-g -O0 -fsanitize=$sanitizer $EXTRA_CFLAGS"
if [[ $? != 0 ]]; then
echo "ERROR: Failed to configure. Try setting CONFIGOPTS?"
return 1
fi
vrun make
if [[ $? != 0 ]]; then
echo "ERROR: failed to build"
return 1
fi
echo "Testing with sanitizer '$sanitizer'"
vrun make test
if [[ $? != 0 ]]; then
echo "Tests for sanitizer '$sanitizer' FAIL!"
echo "Check the above output for failure information"
return 2
else
echo "Tests for sanitizer '$sanitizer' PASS!"
return 0
fi
}
run_valgrind() {
echo "Compiling ag normally for use with valgrind"
[[ -f Makefile ]] && vrun make distclean
vrun ./configure $CONFIGOPTS
if [[ $? != 0 ]]; then
echo "ERROR: Failed to configure. Try setting CONFIGOPTS?"
return 1
fi
vrun make
if [[ $? != 0 ]]; then
echo "ERROR: failed to build"
return 1
fi
echo "Running: AGPROG=\"valgrind -q $PWD/ag\" make test"
AGPROG="valgrind -q $PWD/ag" make test
if [[ $? != 0 ]]; then
echo "Valgrind tests FAIL!"
return 1
else
echo "Valgrind tests PASS!"
return 0
fi
}
#### MAIN ####
run_sanitizers=()
for opt in "$@"; do
if [[ "$opt" == -* ]]; then
case opt in
-h|--help)
usage
exit 0
;;
*)
echo "Unknown option: '$opt'"
usage
exit 1
;;
esac
else
if valid_sanitizer "$opt"; then
run_sanitizers+=("$opt")
else
echo "Invalid Sanitizer: '$opt'"
usage
exit 1
fi
fi
done
if [[ ${#run_sanitizers[@]} == 0 ]]; then
run_sanitizers=(${DEFAULT_SANITIZERS[@]})
fi
if [[ -n $CC ]]; then
echo "Using CC=$CC"
SANITIZE_CC="$CC"
elif which clang &>/dev/null; then
SANITIZE_CC="clang"
else
echo "Warning: CC unset and clang not found"
fi
if [[ -n $CFLAGS ]]; then
EXTRA_CFLAGS="$CFLAGS"
unset CFLAGS
fi
if [[ ! -e ./configure ]]; then
echo "Warning: ./configure not found. Running autogen"
vrun ./autogen.sh || die "autogen.sh failed"
fi
echo "Running sanitizers: ${run_sanitizers[*]}"
failedsan=()
for san in "${run_sanitizers[@]}"; do
run_sanitizer $san
if [[ $? != 0 ]]; then
failedsan+=($san)
fi
done
if [[ ${#failedsan[@]} == 0 ]]; then
echo "All sanitizers PASSED"
exit 0
else
echo "The following sanitizers FAILED: ${failedsan[*]}"
exit ${#failedsan[@]}
fi

View file

@ -1,8 +1,6 @@
#ifndef DECOMPRESS_H
#define DECOMPRESS_H
#include <stdio.h>
#include "config.h"
#include "log.h"
#include "options.h"
@ -18,9 +16,4 @@ typedef enum {
ag_compression_type is_zipped(const void *buf, const int buf_len);
void *decompress(const ag_compression_type zip_type, const void *buf, const int buf_len, const char *dir_full_path, int *new_buf_len);
#if HAVE_FOPENCOOKIE
FILE *decompress_open(int fd, const char *mode, ag_compression_type ctype);
#endif
#endif

View file

@ -20,8 +20,6 @@
const int fnmatch_flags = FNM_PATHNAME;
#endif
ignores *root_ignores;
/* TODO: build a huge-ass list of files we want to ignore by default (build cache stuff, pyc files, etc) */
const char *evil_hardcoded_ignore_files[] = {
@ -32,6 +30,8 @@ const char *evil_hardcoded_ignore_files[] = {
/* Warning: changing the first two strings will break skip_vcs_ignores. */
const char *ignore_pattern_files[] = {
/* Warning: .agignore will one day be removed in favor of .ignore */
".agignore",
".ignore",
".gitignore",
".git/info/exclude",
@ -53,8 +53,6 @@ ignores *init_ignore(ignores *parent, const char *dirname, const size_t dirname_
ig->slash_names_len = 0;
ig->regexes = NULL;
ig->regexes_len = 0;
ig->invert_regexes = NULL;
ig->invert_regexes_len = 0;
ig->slash_regexes = NULL;
ig->slash_regexes_len = 0;
ig->dirname = dirname;
@ -88,7 +86,6 @@ void cleanup_ignore(ignores *ig) {
free_strings(ig->names, ig->names_len);
free_strings(ig->slash_names, ig->slash_names_len);
free_strings(ig->regexes, ig->regexes_len);
free_strings(ig->invert_regexes, ig->invert_regexes_len);
free_strings(ig->slash_regexes, ig->slash_regexes_len);
if (ig->abs_path) {
free(ig->abs_path);
@ -120,21 +117,15 @@ void add_ignore_pattern(ignores *ig, const char *pattern) {
char ***patterns_p;
size_t *patterns_len;
if (is_fnmatch(pattern)) {
if (pattern[0] == '*' && pattern[1] == '.' && strchr(pattern + 2, '.') && !is_fnmatch(pattern + 2)) {
if (pattern[0] == '*' && pattern[1] == '.' && !(is_fnmatch(pattern + 2))) {
patterns_p = &(ig->extensions);
patterns_len = &(ig->extensions_len);
pattern += 2;
pattern_len -= 2;
} else if (pattern[0] == '/') {
patterns_p = &(ig->slash_regexes);
patterns_len = &(ig->slash_regexes_len);
pattern++;
pattern_len--;
} else if (pattern[0] == '!') {
patterns_p = &(ig->invert_regexes);
patterns_len = &(ig->invert_regexes_len);
pattern++;
pattern_len--;
} else {
patterns_p = &(ig->regexes);
patterns_len = &(ig->regexes_len);
@ -202,13 +193,12 @@ static int ackmate_dir_match(const char *dir_name) {
return 0;
}
/* we just care about the match, not where the matches are */
return pcre_exec(opts.ackmate_dir_filter, NULL, dir_name, strlen(dir_name), 0, 0, NULL, 0);
return pcre2_match(opts.ackmate_dir_filter, dir_name, strlen(dir_name), 0, 0, NULL, NULL);
}
/* This is the hottest code in Ag. 10-15% of all execution time is spent here */
static int path_ignore_search(const ignores *ig, const char *path, const char *filename) {
char *temp;
int temp_start_pos;
size_t i;
int match_pos;
@ -219,12 +209,9 @@ static int path_ignore_search(const ignores *ig, const char *path, const char *f
}
ag_asprintf(&temp, "%s/%s", path[0] == '.' ? path + 1 : path, filename);
//ig->abs_path has its leading slash stripped, so we have to strip the leading slash
//of temp as well
temp_start_pos = (temp[0] == '/') ? 1 : 0;
if (strncmp(temp + temp_start_pos, ig->abs_path, ig->abs_path_len) == 0) {
char *slash_filename = temp + temp_start_pos + ig->abs_path_len;
if (strncmp(temp, ig->abs_path, ig->abs_path_len) == 0) {
char *slash_filename = temp + ig->abs_path_len;
if (slash_filename[0] == '/') {
slash_filename++;
}
@ -265,15 +252,6 @@ static int path_ignore_search(const ignores *ig, const char *path, const char *f
}
}
for (i = 0; i < ig->invert_regexes_len; i++) {
if (fnmatch(ig->invert_regexes[i], filename, fnmatch_flags) == 0) {
log_debug("file %s not ignored because name matches regex pattern !%s", filename, ig->invert_regexes[i]);
free(temp);
return 0;
}
log_debug("pattern !%s doesn't match file %s", ig->invert_regexes[i], filename);
}
for (i = 0; i < ig->regexes_len; i++) {
if (fnmatch(ig->regexes[i], filename, fnmatch_flags) == 0) {
log_debug("file %s ignored because name matches regex pattern %s", filename, ig->regexes[i]);
@ -317,7 +295,15 @@ int filename_filter(const char *path, const struct dirent *dir, void *baton) {
}
scandir_baton_t *scandir_baton = (scandir_baton_t *)baton;
const char *path_start = scandir_baton->path_start;
const char *base_path = scandir_baton->base_path;
const size_t base_path_len = scandir_baton->base_path_len;
const char *path_start = path;
for (i = 0; base_path[i] == path[i] && i < base_path_len; i++) {
/* base_path always ends with "/\0" while path doesn't, so this is safe */
path_start = path + i + 2;
}
log_debug("path_start %s filename %s", path_start, filename);
const char *extension = strchr(filename, '.');
if (extension) {

View file

@ -15,8 +15,6 @@ struct ignores {
char **regexes; /* For patterns that need fnmatch */
size_t regexes_len;
char **invert_regexes; /* For "!" patterns */
size_t invert_regexes_len;
char **slash_regexes;
size_t slash_regexes_len;
@ -29,7 +27,7 @@ struct ignores {
};
typedef struct ignores ignores;
extern ignores *root_ignores;
ignores *root_ignores;
extern const char *evil_hardcoded_ignore_files[];
extern const char *ignore_pattern_files[];
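
The `regexes`, `invert_regexes`, and `slash_regexes` arrays in `struct ignores` are filled by `add_ignore_pattern()` in ignore.c, which sorts each pattern into one bucket based on its leading characters. The sketch below shows that dispatch in isolation, following the variant of the hunk that includes `!` patterns. Several parts are assumptions: `bucket_t`, `has_glob_chars`, and `classify_pattern` are hypothetical names, the glob character set `"!*?[]"` is a guess at what `is_fnmatch()` checks, and the literal-name branch is not shown in the diff and is inferred from the `names` and `slash_names` fields.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    BUCKET_NAME,         /* literal name, no glob characters */
    BUCKET_SLASH_NAME,   /* literal path anchored with a leading '/' */
    BUCKET_EXTENSION,    /* "*.ext" with a literal extension */
    BUCKET_SLASH_REGEX,  /* glob anchored with a leading '/' */
    BUCKET_INVERT_REGEX, /* "!" pattern that un-ignores a file */
    BUCKET_REGEX         /* any other glob, matched with fnmatch() */
} bucket_t;

/* Assumed stand-in for is_fnmatch(): true if s contains a glob character. */
static int has_glob_chars(const char *s) {
    return strpbrk(s, "!*?[]") != NULL;
}

/* Hypothetical helper: decide which array a pattern would be stored in. */
static bucket_t classify_pattern(const char *pattern) {
    assert(pattern != NULL);
    if (!has_glob_chars(pattern)) {
        return pattern[0] == '/' ? BUCKET_SLASH_NAME : BUCKET_NAME;
    }
    /* pattern[1] is only read when pattern[0] is '*', so it is in bounds. */
    if (pattern[0] == '*' && pattern[1] == '.' && !has_glob_chars(pattern + 2)) {
        return BUCKET_EXTENSION;
    }
    if (pattern[0] == '/') {
        return BUCKET_SLASH_REGEX;
    }
    if (pattern[0] == '!') {
        return BUCKET_INVERT_REGEX;
    }
    return BUCKET_REGEX;
}
```

Under these assumptions `*.o` lands in the extension bucket, `/build/*` in the slash-anchored globs, `!keep.txt` in the inverted patterns, and `test_*.py` in the general `fnmatch()` list.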

View file

@ -7,67 +7,47 @@
lang_spec_t langs[] = {
{ "actionscript", { "as", "mxml" } },
{ "ada", { "ada", "adb", "ads" } },
{ "asciidoc", { "adoc", "ad", "asc", "asciidoc" } },
{ "apl", { "apl" } },
{ "asm", { "asm", "s" } },
{ "asp", { "asp", "asa", "aspx", "asax", "ashx", "ascx", "asmx" } },
{ "aspx", { "asp", "asa", "aspx", "asax", "ashx", "ascx", "asmx" } },
{ "batch", { "bat", "cmd" } },
{ "bazel", { "bazel" } },
{ "bitbake", { "bb", "bbappend", "bbclass", "inc" } },
{ "bro", { "bro", "bif" } },
{ "cc", { "c", "h", "xs" } },
{ "cfmx", { "cfc", "cfm", "cfml" } },
{ "chpl", { "chpl" } },
{ "clojure", { "clj", "cljs", "cljc", "cljx", "edn" } },
{ "clojure", { "clj", "cljs", "cljc", "cljx" } },
{ "coffee", { "coffee", "cjsx" } },
{ "config", { "config" } },
{ "coq", { "coq", "g", "v" } },
{ "cpp", { "cpp", "cc", "C", "cxx", "m", "hpp", "hh", "h", "H", "hxx", "tpp" } },
{ "crystal", { "cr", "ecr" } },
{ "csharp", { "cs" } },
{ "cshtml", { "cshtml" } },
{ "css", { "css" } },
{ "cython", { "pyx", "pxd", "pxi" } },
{ "delphi", { "pas", "int", "dfm", "nfm", "dof", "dpk", "dpr", "dproj", "groupproj", "bdsgroup", "bdsproj" } },
{ "dlang", { "d", "di" } },
{ "dot", { "dot", "gv" } },
{ "dts", { "dts", "dtsi" } },
{ "ebuild", { "ebuild", "eclass" } },
{ "elisp", { "el" } },
{ "elixir", { "ex", "eex", "exs" } },
{ "elm", { "elm" } },
{ "erlang", { "erl", "hrl" } },
{ "factor", { "factor" } },
{ "fortran", { "f", "F", "f77", "f90", "F90", "f95", "f03", "for", "ftn", "fpp", "FPP" } },
{ "fortran", { "f", "f77", "f90", "f95", "f03", "for", "ftn", "fpp" } },
{ "fsharp", { "fs", "fsi", "fsx" } },
{ "gettext", { "po", "pot", "mo" } },
{ "glsl", { "vert", "tesc", "tese", "geom", "frag", "comp" } },
{ "go", { "go" } },
{ "gradle", { "gradle" } },
{ "groovy", { "groovy", "gtmpl", "gpp", "grunit", "gradle" } },
{ "groovy", { "groovy", "gtmpl", "gpp", "grunit" } },
{ "haml", { "haml" } },
{ "handlebars", { "hbs" } },
{ "haskell", { "hs", "hsig", "lhs" } },
{ "haxe", { "hx" } },
{ "haskell", { "hs", "lhs" } },
{ "hh", { "h" } },
{ "html", { "htm", "html", "shtml", "xhtml" } },
{ "idris", { "idr", "ipkg", "lidr" } },
{ "ini", { "ini" } },
{ "ipython", { "ipynb" } },
{ "isabelle", { "thy" } },
{ "j", { "ijs" } },
{ "jade", { "jade" } },
{ "java", { "java", "properties" } },
{ "jinja2", { "j2" } },
{ "js", { "es6", "js", "jsx", "vue" } },
{ "js", { "js", "jsx", "vue" } },
{ "json", { "json" } },
{ "jsp", { "jsp", "jspx", "jhtm", "jhtml", "jspf", "tag", "tagf" } },
{ "jsp", { "jsp", "jspx", "jhtm", "jhtml" } },
{ "julia", { "jl" } },
{ "kotlin", { "kt" } },
{ "less", { "less" } },
{ "liquid", { "liquid" } },
{ "lisp", { "lisp", "lsp" } },
{ "log", { "log" } },
{ "lua", { "lua" } },
{ "m4", { "m4" } },
{ "make", { "Makefiles", "mk", "mak" } },
@ -76,36 +56,26 @@ lang_spec_t langs[] = {
{ "mason", { "mas", "mhtml", "mpl", "mtxt" } },
{ "matlab", { "m" } },
{ "mathematica", { "m", "wl" } },
{ "md", { "markdown", "mdown", "mdwn", "mkdn", "mkd", "md" } },
{ "mercury", { "m", "moo" } },
{ "naccess", { "asa", "rsa" } },
{ "nim", { "nim" } },
{ "nix", { "nix" } },
{ "objc", { "m", "h" } },
{ "objcpp", { "mm", "h" } },
{ "ocaml", { "ml", "mli", "mll", "mly" } },
{ "octave", { "m" } },
{ "org", { "org" } },
{ "parrot", { "pir", "pasm", "pmc", "ops", "pod", "pg", "tg" } },
{ "pdb", { "pdb" } },
{ "perl", { "pl", "pm", "pm6", "pod", "t" } },
{ "php", { "php", "phpt", "php3", "php4", "php5", "phtml" } },
{ "pike", { "pike", "pmod" } },
{ "plist", { "plist" } },
{ "plone", { "pt", "cpt", "metadata", "cpy", "py", "xml", "zcml" } },
{ "powershell", { "ps1" } },
{ "proto", { "proto" } },
{ "ps1", { "ps1" } },
{ "pug", { "pug" } },
{ "puppet", { "pp" } },
{ "python", { "py" } },
{ "qml", { "qml" } },
{ "racket", { "rkt", "ss", "scm" } },
{ "rake", { "Rakefile" } },
{ "razor", { "cshtml" } },
{ "restructuredtext", { "rst" } },
{ "rs", { "rs" } },
{ "r", { "r", "R", "Rmd", "Rnw", "Rtex", "Rrst" } },
{ "r", { "R", "Rmd", "Rnw", "Rtex", "Rrst" } },
{ "rdoc", { "rdoc" } },
{ "ruby", { "rb", "rhtml", "rjs", "rxml", "erb", "rake", "spec" } },
{ "rust", { "rs" } },
@ -117,32 +87,24 @@ lang_spec_t langs[] = {
{ "smalltalk", { "st" } },
{ "sml", { "sml", "fun", "mlb", "sig" } },
{ "sql", { "sql", "ctl" } },
{ "stata", { "do", "ado" } },
{ "stylus", { "styl" } },
{ "swift", { "swift" } },
{ "tcl", { "tcl", "itcl", "itk" } },
{ "terraform", { "tf", "tfvars" } },
{ "tex", { "tex", "cls", "sty" } },
{ "thrift", { "thrift" } },
{ "tla", { "tla" } },
{ "tt", { "tt", "tt2", "ttml" } },
{ "toml", { "toml" } },
{ "ts", { "ts", "tsx" } },
{ "twig", { "twig" } },
{ "vala", { "vala", "vapi" } },
{ "vb", { "bas", "cls", "frm", "ctl", "vb", "resx" } },
{ "velocity", { "vm", "vtl", "vsl" } },
{ "verilog", { "v", "vh", "sv", "svh" } },
{ "verilog", { "v", "vh", "sv" } },
{ "vhdl", { "vhd", "vhdl" } },
{ "vim", { "vim" } },
{ "vue", { "vue" } },
{ "wix", { "wxi", "wxs" } },
{ "wsdl", { "wsdl" } },
{ "wadl", { "wadl" } },
{ "xml", { "xml", "dtd", "xsl", "xslt", "xsd", "ent", "tld", "plist", "wsdl" } },
{ "yaml", { "yaml", "yml" } },
{ "zeek", { "zeek", "bro", "bif" } },
{ "zephir", { "zep" } }
{ "xml", { "xml", "dtd", "xsl", "xslt", "ent", "tld" } },
{ "yaml", { "yaml", "yml" } }
};
size_t get_lang_count() {
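
Each `langs[]` entry above pairs a language name with a list of file extensions. A minimal sketch of how such a table can be queried is shown here. The struct layout, the `MAX_EXTENSIONS` bound of 12, and the helpers `find_lang` and `lang_has_extension` are assumptions made for illustration and may not match the definitions in lang.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXTENSIONS 12 /* assumed upper bound on extensions per language */

typedef struct {
    const char *name;
    const char *extensions[MAX_EXTENSIONS]; /* unused slots are NULL */
} lang_spec_t;

/* A three-entry excerpt in the same shape as the table in the diff. */
static const lang_spec_t sample_langs[] = {
    { "haskell", { "hs", "lhs" } },
    { "js", { "js", "jsx", "vue" } },
    { "yaml", { "yaml", "yml" } }
};

/* Hypothetical helper: linear search for a language by name. */
static const lang_spec_t *find_lang(const lang_spec_t *langs, size_t count, const char *name) {
    size_t i;
    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(langs[i].name, name) == 0) {
            return &langs[i];
        }
    }
    return NULL;
}

/* Hypothetical helper: 1 if ext is listed for the language, 0 otherwise. */
static int lang_has_extension(const lang_spec_t *lang, const char *ext) {
    size_t i;
    assert(lang != NULL && ext != NULL);
    for (i = 0; i < MAX_EXTENSIONS && lang->extensions[i] != NULL; i++) {
        if (strcmp(lang->extensions[i], ext) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Because the aggregate initializer zero-fills unused slots, the extension loop can stop at the first `NULL` without storing a separate count.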

View file

@ -4,7 +4,6 @@
#include "log.h"
#include "util.h"
pthread_mutex_t print_mtx = PTHREAD_MUTEX_INITIALIZER;
static enum log_level log_threshold = LOG_LEVEL_ERR;
void set_log_level(enum log_level threshold) {

View file

@ -9,7 +9,7 @@
#include <pthread.h>
#endif
extern pthread_mutex_t print_mtx;
pthread_mutex_t print_mtx;
enum log_level {
LOG_LEVEL_DEBUG = 10,

View file

@ -1,5 +1,7 @@
#include "config.h"
#include <ctype.h>
#include <pcre.h>
#include <pcre2.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
@ -9,20 +11,10 @@
#include <windows.h>
#endif
#include "config.h"
#ifdef HAVE_SYS_CPUSET_H
#include <sys/cpuset.h>
#endif
#ifdef HAVE_PTHREAD_H
#include <pthread.h>
#endif
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && defined(__FreeBSD__)
#include <pthread_np.h>
#endif
#include "log.h"
#include "options.h"
#include "search.h"
@ -37,7 +29,7 @@ int main(int argc, char **argv) {
char **base_paths = NULL;
char **paths = NULL;
int i;
int pcre_opts = PCRE_MULTILINE;
int pcre_opts = PCRE2_MULTILINE;
int study_opts = 0;
worker_t *workers = NULL;
int workers_len;
@ -57,19 +49,22 @@ int main(int argc, char **argv) {
out_fd = stdout;
parse_options(argc, argv, &base_paths, &paths);
log_debug("PCRE Version: %s", pcre_version());
log_debug("PCRE Version: %s", pcre2_version());
if (opts.stats) {
memset(&stats, 0, sizeof(stats));
gettimeofday(&(stats.time_start), NULL);
}
#ifdef USE_PCRE_JIT
int has_jit = 0;
pcre_config(PCRE_CONFIG_JIT, &has_jit);
if (has_jit) {
study_opts |= PCRE_STUDY_JIT_COMPILE;
}
#endif
/*
TODO: call pcre2_jit_compile in compile_study
// #ifdef USE_PCRE2_JIT
// int has_jit = 0;
// pcre2_config(PCRE2_CONFIG_JIT, &has_jit);
// if (has_jit) {
// study_opts |= PCRE2_STUDY_JIT_COMPILE;
// }
// #endif
*/
#ifdef _WIN32
{
@ -131,7 +126,7 @@ int main(int argc, char **argv) {
}
} else {
if (opts.casing == CASE_INSENSITIVE) {
pcre_opts |= PCRE_CASELESS;
pcre_opts |= PCRE2_CASELESS;
}
if (opts.word_regexp) {
char *word_regexp_query;
@ -140,7 +135,7 @@ int main(int argc, char **argv) {
opts.query = word_regexp_query;
opts.query_len = strlen(opts.query);
}
compile_study(&opts.re, &opts.re_extra, opts.query, pcre_opts, study_opts);
compile_study(&opts.re, &opts.re_ctx, opts.query, pcre_opts, study_opts);
}
if (opts.search_stream) {
@ -152,13 +147,9 @@ int main(int argc, char **argv) {
if (rv != 0) {
die("Error in pthread_create(): %s", strerror(rv));
}
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && (defined(USE_CPU_SET) || defined(HAVE_SYS_CPUSET_H))
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && defined(USE_CPU_SET)
if (opts.use_thread_affinity) {
#if defined(__linux__) || defined(__midipix__)
cpu_set_t cpu_set;
#elif __FreeBSD__
cpuset_t cpu_set;
#endif
CPU_ZERO(&cpu_set);
CPU_SET(i % num_cores, &cpu_set);
rv = pthread_setaffinity_np(workers[i].thread, sizeof(cpu_set), &cpu_set);
@ -185,7 +176,7 @@ int main(int argc, char **argv) {
log_debug("searching path %s for %s", paths[i], opts.query);
symhash = NULL;
ignores *ig = init_ignore(root_ignores, "", 0);
struct stat s = { .st_dev = 0 };
struct stat s = {.st_dev = 0 };
#ifndef _WIN32
/* The device is ignored if opts.one_dev is false, so it's fine
* to leave it at the default 0
@ -213,7 +204,7 @@ int main(int argc, char **argv) {
double time_diff = ((long)stats.time_end.tv_sec * 1000000 + stats.time_end.tv_usec) -
((long)stats.time_start.tv_sec * 1000000 + stats.time_start.tv_usec);
time_diff /= 1000000;
printf("%zu matches\n%zu files contained matches\n%zu files searched\n%zu bytes searched\n%f seconds\n",
printf("%ld matches\n%ld files contained matches\n%ld files searched\n%ld bytes searched\n%f seconds\n",
stats.total_matches, stats.total_file_matches, stats.total_files, stats.total_bytes, time_diff);
pthread_mutex_destroy(&stats_mtx);
}
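
The statistics hunk above computes elapsed time from two `gettimeofday()` samples and switches the counters between `%zu` and `%ld`. As a minimal sketch (not ag's code), the same arithmetic can be isolated as shown here. `elapsed_seconds` and `format_summary` are hypothetical names, and the sketch assumes the counters are `size_t`, which this diff does not show. `%zu` is the C99 conversion for `size_t`; `%ld` is only correct when the argument really is a `long`.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: elapsed wall-clock time between two
 * gettimeofday() samples, in seconds. */
double elapsed_seconds(struct timeval start, struct timeval end) {
    double usec = ((double)end.tv_sec * 1000000.0 + (double)end.tv_usec) -
                  ((double)start.tv_sec * 1000000.0 + (double)start.tv_usec);
    return usec / 1000000.0;
}

/* Hypothetical helper: format a short summary. Assumes the counter is a
 * size_t, hence %zu. Returns whatever snprintf returns. */
int format_summary(char *out, size_t out_len, size_t matches, double seconds) {
    return snprintf(out, out_len, "%zu matches\n%f seconds\n", matches, seconds);
}
```

Converting each field to `double` before multiplying avoids overflow on platforms where `long` is 32 bits.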


@ -1,3 +1,5 @@
#include "config.h"
#include <errno.h>
#include <limits.h>
#include <stdarg.h>
@ -8,7 +10,6 @@
#include <sys/stat.h>
#include <unistd.h>
#include "config.h"
#include "ignore.h"
#include "lang.h"
#include "log.h"
@ -20,8 +21,6 @@ const char *color_line_number = "\033[1;33m"; /* bold yellow */
const char *color_match = "\033[30;43m"; /* black with yellow background */
const char *color_path = "\033[1;32m"; /* bold green */
cli_options opts;
/* TODO: try to obey out_fd? */
void usage(void) {
printf("\n");
@ -59,14 +58,11 @@ Output Options:\n\
(Enabled by default)\n\
-C --context [LINES] Print lines before and after matches (Default: 2)\n\
--[no]group Same as --[no]break --[no]heading\n\
-g --filename-pattern PATTERN\n\
Print filenames matching PATTERN\n\
-g PATTERN Print filenames matching PATTERN\n\
-l --files-with-matches Only print filenames that contain matches\n\
(don't print the matching lines)\n\
-L --files-without-matches\n\
Only print filenames that don't contain matches\n\
--print-all-files Print headings for all files searched, even those that\n\
don't contain matches\n\
--[no]numbers Print line numbers. Default is to omit line numbers\n\
when searching streams\n\
-o --only-matching Prints only the matching part of the lines\n\
@ -129,7 +125,7 @@ void print_version(void) {
char lzma = '-';
char zlib = '-';
#ifdef USE_PCRE_JIT
#ifdef USE_PCRE2_JIT
jit = '+';
#endif
#ifdef HAVE_LZMA_H
@ -145,29 +141,18 @@ void print_version(void) {
}
void init_options(void) {
char *term = getenv("TERM");
memset(&opts, 0, sizeof(opts));
opts.casing = CASE_DEFAULT;
opts.color = TRUE;
if (term && !strcmp(term, "dumb")) {
opts.color = FALSE;
}
opts.color_win_ansi = FALSE;
opts.max_matches_per_file = 0;
opts.max_search_depth = DEFAULT_MAX_SEARCH_DEPTH;
#if defined(__APPLE__) || defined(__MACH__)
/* mmap() is slower than normal read() on macOS. Default to off. */
opts.mmap = FALSE;
#else
opts.mmap = TRUE;
#endif
opts.multiline = TRUE;
opts.width = 0;
opts.path_sep = '\n';
opts.print_break = TRUE;
opts.print_path = PATH_PRINT_DEFAULT;
opts.print_all_paths = FALSE;
opts.print_line_numbers = TRUE;
opts.recurse_dirs = TRUE;
opts.color_path = ag_strdup(color_path);
@ -185,24 +170,23 @@ void cleanup_options(void) {
free(opts.query);
}
pcre_free(opts.re);
if (opts.re_extra) {
/* Using pcre_free_study on pcre_extra* can segfault on some versions of PCRE */
pcre_free(opts.re_extra);
pcre2_code_free(opts.re);
if (opts.re_ctx) {
pcre2_compile_context_free(opts.re_ctx);
}
if (opts.ackmate_dir_filter) {
pcre_free(opts.ackmate_dir_filter);
pcre2_code_free(opts.ackmate_dir_filter);
}
if (opts.ackmate_dir_filter_extra) {
pcre_free(opts.ackmate_dir_filter_extra);
if (opts.ackmate_dir_filter_ctx) {
pcre2_compile_context_free(opts.ackmate_dir_filter_ctx);
}
if (opts.file_search_regex) {
pcre_free(opts.file_search_regex);
pcre2_code_free(opts.file_search_regex);
}
if (opts.file_search_regex_extra) {
pcre_free(opts.file_search_regex_extra);
if (opts.file_search_regex_ctx) {
pcre2_compile_context_free(opts.file_search_regex_ctx);
}
}
@ -210,7 +194,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
int ch;
size_t i;
int path_len = 0;
int base_path_len = 0;
int useless = 0;
int group = 1;
int help = 0;
@ -258,7 +241,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
{ "debug", no_argument, NULL, 'D' },
{ "depth", required_argument, NULL, 0 },
{ "filename", no_argument, NULL, 0 },
{ "filename-pattern", required_argument, NULL, 'g' },
{ "file-search-regex", required_argument, NULL, 'G' },
{ "files-with-matches", no_argument, NULL, 'l' },
{ "files-without-matches", no_argument, NULL, 'L' },
@ -315,7 +297,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
{ "passthru", no_argument, &opts.passthrough, 1 },
{ "path-to-ignore", required_argument, NULL, 'p' },
{ "print0", no_argument, NULL, '0' },
{ "print-all-files", no_argument, NULL, 0 },
{ "print-long-lines", no_argument, &opts.print_long_lines, 1 },
{ "recurse", no_argument, NULL, 'r' },
{ "search-binary", no_argument, &opts.search_binary_files, 1 },
@ -434,7 +415,7 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
case 'g':
needs_query = accepts_query = 0;
opts.match_files = 1;
/* fall through */
/* Fall through so regex is built */
case 'G':
if (file_search_regex) {
log_err("File search regex (-g or -G) already specified.");
@ -453,9 +434,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
opts.casing = CASE_INSENSITIVE;
break;
case 'L':
opts.print_nonmatching_files = 1;
opts.print_path = PATH_PRINT_TOP;
break;
opts.invert_match = 1;
/* fall through */
case 'l':
needs_query = 0;
opts.print_filename_only = 1;
@ -524,7 +504,7 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
break;
case 0: /* Long option */
if (strcmp(longopts[opt_index].name, "ackmate-dir-filter") == 0) {
compile_study(&opts.ackmate_dir_filter, &opts.ackmate_dir_filter_extra, optarg, 0, 0);
compile_study(&opts.ackmate_dir_filter, &opts.ackmate_dir_filter_ctx, optarg, 0, 0);
break;
} else if (strcmp(longopts[opt_index].name, "depth") == 0) {
opts.max_search_depth = atoi(optarg);
@ -552,9 +532,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
} else if (strcmp(longopts[opt_index].name, "pager") == 0) {
opts.pager = optarg;
break;
} else if (strcmp(longopts[opt_index].name, "print-all-files") == 0) {
opts.print_all_paths = TRUE;
break;
} else if (strcmp(longopts[opt_index].name, "workers") == 0) {
opts.workers = atoi(optarg);
break;
@ -597,7 +574,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
}
log_err("option %s does not take a value", longopts[opt_index].name);
/* fall through */
default:
usage();
exit(1);
@ -611,21 +587,21 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
if (file_search_regex) {
int pcre_opts = 0;
if (opts.casing == CASE_INSENSITIVE || (opts.casing == CASE_SMART && is_lowercase(file_search_regex))) {
pcre_opts |= PCRE_CASELESS;
pcre_opts |= PCRE2_CASELESS;
}
if (opts.word_regexp) {
char *old_file_search_regex = file_search_regex;
ag_asprintf(&file_search_regex, "\\b%s\\b", file_search_regex);
free(old_file_search_regex);
}
compile_study(&opts.file_search_regex, &opts.file_search_regex_extra, file_search_regex, pcre_opts, 0);
compile_study(&opts.file_search_regex, &opts.file_search_regex_ctx, file_search_regex, pcre_opts, 0);
free(file_search_regex);
}
if (has_filetype) {
num_exts = combine_file_extensions(ext_index, lang_num, &extensions);
lang_regex = make_lang_regex(extensions, num_exts);
compile_study(&opts.file_search_regex, &opts.file_search_regex_extra, lang_regex, 0, 0);
compile_study(&opts.file_search_regex, &opts.file_search_regex_ctx, lang_regex, 0, 0);
}
if (extensions) {
@ -713,10 +689,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
const char *config_home = getenv("XDG_CONFIG_HOME");
if (config_home) {
ag_asprintf(&gitconfig_res, "%s/%s", config_home, "git/ignore");
} else if (home_dir) {
ag_asprintf(&gitconfig_res, "%s/%s", home_dir, ".config/git/ignore");
} else {
gitconfig_res = ag_strdup("");
ag_asprintf(&gitconfig_res, "%s/%s", home_dir, ".config/git/ignore");
}
}
log_debug("global core.excludesfile: %s", gitconfig_res);
@ -774,13 +748,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
}
if (accepts_query && argc > 0) {
if (!needs_query && strlen(argv[0]) == 0) {
// use default query
opts.query = ag_strdup(".");
} else {
// use the provided query
opts.query = ag_strdup(argv[0]);
}
// use the provided query
opts.query = ag_strdup(argv[0]);
argc--;
argv++;
} else if (!needs_query) {
@ -801,7 +770,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
}
char *path = NULL;
char *base_path = NULL;
#ifdef PATH_MAX
char *tmp = NULL;
#endif
@ -819,20 +787,10 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
(*paths)[i] = path;
#ifdef PATH_MAX
tmp = ag_malloc(PATH_MAX);
base_path = realpath(path, tmp);
(*base_paths)[i] = realpath(path, tmp);
#else
base_path = realpath(path, NULL);
(*base_paths)[i] = realpath(path, NULL);
#endif
if (base_path) {
base_path_len = strlen(base_path);
/* add trailing slash */
if (base_path_len > 1 && base_path[base_path_len - 1] != '/') {
base_path = ag_realloc(base_path, base_path_len + 2);
base_path[base_path_len] = '/';
base_path[base_path_len + 1] = '\0';
}
}
(*base_paths)[i] = base_path;
}
/* Make sure we search these paths instead of stdin. */
opts.search_stream = 0;
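
The trailing-slash handling that appears on one side of the hunk above can be sketched in isolation. This is an illustration only, not ag's code: `with_trailing_slash` is a hypothetical helper, and it copies into a fresh `malloc` buffer instead of growing the `realpath()` result with `ag_realloc`. It mirrors the condition in the diff (length greater than 1 and last byte not `/`).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of ag): return a newly allocated copy of
 * path that ends in '/'. Paths of length 0 or 1 and paths that already
 * end in '/' are copied unchanged, matching the condition in the diff.
 * Returns NULL if path is NULL or allocation fails. Caller frees. */
char *with_trailing_slash(const char *path) {
    size_t len;
    char *out;

    if (path == NULL) {
        return NULL;
    }
    len = strlen(path);
    out = malloc(len + 2); /* room for an extra '/' and the terminator */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, path, len + 1);
    if (len > 1 && out[len - 1] != '/') {
        out[len] = '/';
        out[len + 1] = '\0';
    }
    return out;
}
```

The NULL check matters because `realpath()` returns NULL when the path cannot be resolved.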


@ -1,10 +1,12 @@
#ifndef OPTIONS_H
#define OPTIONS_H
#include "config.h"
#include <getopt.h>
#include <sys/stat.h>
#include <pcre.h>
#include <pcre2.h>
#define DEFAULT_AFTER_LEN 2
#define DEFAULT_BEFORE_LEN 2
@ -28,15 +30,15 @@ enum path_print_behavior {
typedef struct {
int ackmate;
pcre *ackmate_dir_filter;
pcre_extra *ackmate_dir_filter_extra;
pcre2_code *ackmate_dir_filter;
pcre2_compile_context *ackmate_dir_filter_ctx;
size_t after;
size_t before;
enum case_behavior casing;
const char *file_search_string;
int match_files;
pcre *file_search_regex;
pcre_extra *file_search_regex_extra;
pcre2_code *file_search_regex;
pcre2_compile_context *file_search_regex_ctx;
int color;
char *color_line_number;
char *color_match;
@ -60,14 +62,12 @@ typedef struct {
int print_break;
int print_count;
int print_filename_only;
int print_nonmatching_files;
int print_path;
int print_all_paths;
int print_line_numbers;
int print_long_lines; /* TODO: support this in print.c */
int passthrough;
pcre *re;
pcre_extra *re_extra;
pcre2_code *re;
pcre2_compile_context *re_ctx;
int recurse_dirs;
int search_all_files;
int skip_vcs_ignores;
@ -92,7 +92,7 @@ typedef struct {
} cli_options;
/* global options. parse_options gives it sane values, everything else reads from it */
extern cli_options opts;
cli_options opts;
typedef struct option option_t;


@ -26,7 +26,6 @@ __thread struct print_context {
size_t prev_line;
size_t last_prev_line;
size_t prev_line_offset;
size_t line_preceding_current_match_offset;
size_t lines_since_last_match;
size_t last_printed_match;
int in_a_match;
@ -42,7 +41,6 @@ void print_init_context(void) {
print_context.prev_line = 0;
print_context.last_prev_line = 0;
print_context.prev_line_offset = 0;
print_context.line_preceding_current_match_offset = 0;
print_context.lines_since_last_match = INT_MAX;
print_context.last_printed_match = 0;
print_context.in_a_match = FALSE;
@ -150,7 +148,6 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
ssize_t lines_to_print = 0;
char sep = '-';
size_t i, j;
int blanks_between_matches = opts.context || opts.after || opts.before;
if (opts.ackmate || opts.vimgrep) {
sep = ':';
@ -176,7 +173,7 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
if (cur_match < matches_len && i == matches[cur_match].start) {
print_context.in_a_match = TRUE;
/* We found the start of a match */
if (cur_match > 0 && blanks_between_matches && print_context.lines_since_last_match > (opts.before + opts.after + 1)) {
if (cur_match > 0 && opts.context && print_context.lines_since_last_match > (opts.before + opts.after + 1)) {
fprintf(out_fd, "--\n");
}
@ -225,10 +222,14 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
/* print headers for ackmate to parse */
print_line_number(print_context.line, ';');
for (; print_context.last_printed_match < cur_match; print_context.last_printed_match++) {
size_t start = matches[print_context.last_printed_match].start - print_context.line_preceding_current_match_offset;
fprintf(out_fd, "%lu %lu",
/* Don't print negative offsets. This isn't quite right, but not many people use --ackmate */
long start = (long)(matches[print_context.last_printed_match].start - print_context.prev_line_offset);
if (start < 0) {
start = 0;
}
fprintf(out_fd, "%li %li",
start,
matches[print_context.last_printed_match].end - matches[print_context.last_printed_match].start);
(long)(matches[print_context.last_printed_match].end - matches[print_context.last_printed_match].start));
print_context.last_printed_match == cur_match - 1 ? fputc(':', out_fd) : fputc(',', out_fd);
}
print_line(buf, i, print_context.prev_line_offset);
@ -315,9 +316,6 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
print_trailing_context(path, &buf[print_context.prev_line_offset], i - print_context.prev_line_offset);
print_context.prev_line_offset = i + 1; /* skip the newline */
if (!print_context.in_a_match) {
print_context.line_preceding_current_match_offset = i + 1;
}
/* File doesn't end with a newline. Print one so the output is pretty. */
if (i == buf_len && buf[i - 1] != '\n') {


@ -7,7 +7,6 @@ typedef struct {
const ignores *ig;
const char *base_path;
size_t base_path_len;
const char *path_start;
} scandir_baton_t;
typedef int (*filter_fp)(const char *path, const struct dirent *, void *);


@ -2,32 +2,18 @@
#include "print.h"
#include "scandir.h"
size_t alpha_skip_lookup[256];
size_t *find_skip_lookup;
uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
work_queue_t *work_queue = NULL;
work_queue_t *work_queue_tail = NULL;
int done_adding_files = 0;
pthread_cond_t files_ready = PTHREAD_COND_INITIALIZER;
pthread_mutex_t stats_mtx = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t work_queue_mtx = PTHREAD_MUTEX_INITIALIZER;
symdir_t *symhash = NULL;
/* Returns: -1 if skipped, otherwise # of matches */
ssize_t search_buf(const char *buf, const size_t buf_len,
const char *dir_full_path) {
void search_buf(const char *buf, const size_t buf_len,
const char *dir_full_path) {
int binary = -1; /* 1 = yes, 0 = no, -1 = don't know */
size_t buf_offset = 0;
if (opts.search_stream) {
binary = 0;
} else if (!opts.search_binary_files && opts.mmap) { /* if not using mmap, binary files have already been skipped */
} else if (!opts.search_binary_files) {
binary = is_binary((const void *)buf, buf_len);
if (binary) {
log_debug("File %s is binary. Skipping...", dir_full_path);
return -1;
return;
}
}
@ -59,18 +45,18 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
matches_len = 1;
} else if (opts.literal) {
const char *match_ptr = buf;
strncmp_fp ag_strnstr_fp = get_strstr(opts.casing);
while (buf_offset < buf_len) {
/* hash_strnstr only for little-endian platforms that allow unaligned access */
#if defined(__i386__) || defined(__x86_64__)
/* Decide whether to fall back on boyer-moore */
if ((size_t)opts.query_len < 2 * sizeof(uint16_t) - 1 || opts.query_len >= UCHAR_MAX) {
match_ptr = boyer_moore_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup, opts.casing == CASE_INSENSITIVE);
} else {
if ((size_t)opts.query_len < 2 * sizeof(uint16_t) - 1 || opts.query_len >= UCHAR_MAX)
match_ptr = ag_strnstr_fp(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup);
else
match_ptr = hash_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, h_table, opts.casing == CASE_SENSITIVE);
}
#else
match_ptr = boyer_moore_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup, opts.casing == CASE_INSENSITIVE);
match_ptr = ag_strnstr_fp(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup);
#endif
if (match_ptr == NULL) {
@ -114,8 +100,11 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
} else {
int offset_vector[3];
if (opts.multiline) {
/* we just care about the match, not where the matches are */
return pcre2_match(opts.ackmate_dir_filter, dir_name, strlen(dir_name), 0, 0, NULL, NULL);
while (buf_offset < buf_len &&
(pcre_exec(opts.re, opts.re_extra, buf, buf_len, buf_offset, 0, offset_vector, 3)) >= 0) {
(pcre2_match(opts.re, buf, buf_len, buf_offset, 0, match_data)) >= 0) {
log_debug("Regex match found. File %s, offset %i bytes.", dir_full_path, offset_vector[0]);
buf_offset = offset_vector[1];
if (offset_vector[0] == offset_vector[1]) {
@ -143,7 +132,7 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
}
size_t line_offset = 0;
while (line_offset < line_len) {
int rv = pcre_exec(opts.re, opts.re_extra, line, line_len, line_offset, 0, offset_vector, 3);
int rv = pcre2_match(opts.re, opts.re_ctx, line, line_len, line_offset, 0, offset_vector, 3);
if (rv < 0) {
break;
}
@ -188,16 +177,25 @@ multiline_done:
pthread_mutex_unlock(&stats_mtx);
}
if (!opts.print_nonmatching_files && (matches_len > 0 || opts.print_all_paths)) {
if (matches_len > 0) {
if (binary == -1 && !opts.print_filename_only) {
binary = is_binary((const void *)buf, buf_len);
}
pthread_mutex_lock(&print_mtx);
if (opts.print_filename_only) {
if (opts.print_count) {
print_path_count(dir_full_path, opts.path_sep, (size_t)matches_len);
} else {
print_path(dir_full_path, opts.path_sep);
/* If the --files-without-matches or -L option is passed we should
* not print a matching line. This option currently sets
* opts.print_filename_only and opts.invert_match. Unfortunately
* setting the latter has the side effect of making matches.len = 1
* on a file-without-matches which is not desired behaviour. See
* GitHub issue 206 for the consequences if this behaviour is not
* checked. */
if (!opts.invert_match || matches_len < 2) {
if (opts.print_count) {
print_path_count(dir_full_path, opts.path_sep, (size_t)matches_len);
} else {
print_path(dir_full_path, opts.path_sep);
}
}
} else if (binary) {
print_binary_file_matches(dir_full_path);
@ -219,16 +217,11 @@ multiline_done:
if (matches_size > 0) {
free(matches);
}
/* FIXME: handle case where matches_len > SSIZE_MAX */
return (ssize_t)matches_len;
}
/* Return value: -1 if skipped, otherwise # of matches */
/* TODO: this will only match single lines. multi-line regexes silently don't match */
ssize_t search_stream(FILE *stream, const char *path) {
void search_stream(FILE *stream, const char *path) {
char *line = NULL;
ssize_t matches_count = 0;
ssize_t line_len = 0;
size_t line_cap = 0;
size_t i;
@ -236,17 +229,8 @@ ssize_t search_stream(FILE *stream, const char *path) {
print_init_context();
for (i = 1; (line_len = getline(&line, &line_cap, stream)) > 0; i++) {
ssize_t result;
opts.stream_line_num = i;
result = search_buf(line, line_len, path);
if (result > 0) {
if (matches_count == -1) {
matches_count = 0;
}
matches_count += result;
} else if (matches_count <= 0 && result == -1) {
matches_count = -1;
}
search_buf(line, line_len, path);
if (line[line_len - 1] == '\n') {
line_len--;
}
@ -255,35 +239,16 @@ ssize_t search_stream(FILE *stream, const char *path) {
free(line);
print_cleanup_context();
return matches_count;
}
void search_file(const char *file_full_path) {
int fd = -1;
int fd;
off_t f_len = 0;
char *buf = NULL;
struct stat statbuf;
int rv = 0;
int matches_count = -1;
FILE *fp = NULL;
rv = stat(file_full_path, &statbuf);
if (rv != 0) {
log_err("Skipping %s: Error stat()ing file.", file_full_path);
goto cleanup;
}
if (opts.stdout_inode != 0 && opts.stdout_inode == statbuf.st_ino) {
log_debug("Skipping %s: stdout is redirected to it", file_full_path);
goto cleanup;
}
// handling only regular files and FIFOs
if (!S_ISREG(statbuf.st_mode) && !S_ISFIFO(statbuf.st_mode)) {
log_err("Skipping %s: Mode %u is not a file.", file_full_path, statbuf.st_mode);
goto cleanup;
}
fd = open(file_full_path, O_RDONLY);
if (fd < 0) {
/* XXXX: strerror is not thread-safe */
@ -291,7 +256,6 @@ void search_file(const char *file_full_path) {
goto cleanup;
}
// repeating stat check with file handle to prevent TOCTOU issue
rv = fstat(fd, &statbuf);
if (rv != 0) {
log_err("Skipping %s: Error fstat()ing file.", file_full_path);
@ -303,8 +267,7 @@ void search_file(const char *file_full_path) {
goto cleanup;
}
// handling only regular files and FIFOs
if (!S_ISREG(statbuf.st_mode) && !S_ISFIFO(statbuf.st_mode)) {
if ((statbuf.st_mode & S_IFMT) == 0) {
log_err("Skipping %s: Mode %u is not a file.", file_full_path, statbuf.st_mode);
goto cleanup;
}
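
The two mode checks in the hunk above are not equivalent. `(st_mode & S_IFMT) == 0` only rejects a mode with no file-type bits at all, so directories, sockets and devices pass it. The `S_ISREG`/`S_ISFIFO` form accepts exactly regular files and FIFOs. A minimal sketch of the stricter check, with `is_searchable_mode` as a hypothetical name:

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: nonzero only for regular files and FIFOs,
 * the two kinds of file the comment in the diff says are handled. */
int is_searchable_mode(mode_t mode) {
    return S_ISREG(mode) || S_ISFIFO(mode);
}
```

The same reasoning applies to testing for a FIFO with `st_mode & S_IFIFO`: the `S_IF*` values are type codes within the `S_IFMT` mask, not independent flag bits, so `S_ISFIFO(mode)` is the portable test.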
@ -314,7 +277,7 @@ void search_file(const char *file_full_path) {
if (statbuf.st_mode & S_IFIFO) {
log_debug("%s is a named pipe. stream searching", file_full_path);
fp = fdopen(fd, "r");
matches_count = search_stream(fp, file_full_path);
search_stream(fp, file_full_path);
fclose(fp);
goto cleanup;
}
@ -323,18 +286,13 @@ void search_file(const char *file_full_path) {
if (f_len == 0) {
if (opts.query[0] == '.' && opts.query_len == 1 && !opts.literal && opts.search_all_files) {
matches_count = search_buf(buf, f_len, file_full_path);
search_buf(buf, f_len, file_full_path);
} else {
log_debug("Skipping %s: file is empty.", file_full_path);
}
goto cleanup;
}
if (!opts.literal && f_len > INT_MAX) {
log_err("Skipping %s: pcre_exec() can't handle files larger than %i bytes.", file_full_path, INT_MAX);
goto cleanup;
}
#ifdef _WIN32
{
HANDLE hmmap = CreateFileMapping(
@ -368,23 +326,9 @@ void search_file(const char *file_full_path) {
#endif
} else {
buf = ag_malloc(f_len);
ssize_t bytes_read = 0;
if (!opts.search_binary_files) {
bytes_read += read(fd, buf, ag_min(f_len, 512));
// Optimization: If skipping binary files, don't read the whole buffer before checking if binary or not.
if (is_binary(buf, f_len)) {
log_debug("File %s is binary. Skipping...", file_full_path);
goto cleanup;
}
}
while (bytes_read < f_len) {
bytes_read += read(fd, buf + bytes_read, f_len);
}
if (bytes_read != f_len) {
die("File %s read(): expected to read %u bytes but read %u", file_full_path, f_len, bytes_read);
size_t bytes_read = read(fd, buf, f_len);
if ((off_t)bytes_read != f_len) {
die("expected to read %u bytes but read %u", f_len, bytes_read);
}
}
#endif
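
Both read paths in the hunk above have sharp edges. The loop version passes `f_len` as the count on every iteration, even after a partial read has advanced the destination pointer, and it adds the return value of `read()` without checking for -1. The single-call version treats any short read as fatal. A hedged sketch of a loop that handles short reads, `EINTR` and end-of-file follows; `read_fully` is a hypothetical helper, not something in ag.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from fd into buf, retrying
 * after short reads and EINTR. Returns the number of bytes read (less
 * than len only if end-of-file was reached) or -1 on error. */
ssize_t read_fully(int fd, char *buf, size_t len) {
    size_t total = 0;

    while (total < len) {
        /* Only ask for what still fits in the buffer. */
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted before any data arrived: retry */
            }
            return -1;
        }
        if (n == 0) {
            break; /* end of file */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A caller could then compare the result with `f_len` and decide whether a short count (for example, a file truncated after `fstat()`) should be an error or just a smaller buffer to search.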
@ -392,45 +336,29 @@ void search_file(const char *file_full_path) {
if (opts.search_zip_files) {
ag_compression_type zip_type = is_zipped(buf, f_len);
if (zip_type != AG_NO_COMPRESSION) {
#if HAVE_FOPENCOOKIE
log_debug("%s is a compressed file. stream searching", file_full_path);
fp = decompress_open(fd, "r", zip_type);
matches_count = search_stream(fp, file_full_path);
fclose(fp);
#else
int _buf_len = (int)f_len;
char *_buf = decompress(zip_type, buf, f_len, file_full_path, &_buf_len);
if (_buf == NULL || _buf_len == 0) {
log_err("Cannot decompress zipped file %s", file_full_path);
goto cleanup;
}
matches_count = search_buf(_buf, _buf_len, file_full_path);
search_buf(_buf, _buf_len, file_full_path);
free(_buf);
#endif
goto cleanup;
}
}
matches_count = search_buf(buf, f_len, file_full_path);
search_buf(buf, f_len, file_full_path);
cleanup:
if (opts.print_nonmatching_files && matches_count == 0) {
pthread_mutex_lock(&print_mtx);
print_path(file_full_path, opts.path_sep);
pthread_mutex_unlock(&print_mtx);
opts.match_found = 1;
}
print_cleanup_context();
if (buf != NULL) {
#ifdef _WIN32
UnmapViewOfFile(buf);
#else
if (opts.mmap) {
if (buf != MAP_FAILED) {
munmap(buf, f_len);
}
munmap(buf, f_len);
} else {
free(buf);
}
@ -446,6 +374,7 @@ void *search_file_worker(void *i) {
int worker_id = *(int *)i;
log_debug("Worker %i started", worker_id);
match_data = pcre2_match_data_create_from_pattern(re, NULL);
while (TRUE) {
pthread_mutex_lock(&work_queue_mtx);
while (work_queue == NULL) {
@ -533,8 +462,6 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
struct dirent *dir = NULL;
scandir_baton_t scandir_baton;
int results = 0;
size_t base_path_len = 0;
const char *path_start = path;
char *dir_full_path = NULL;
const char *ignore_file = NULL;
@ -550,7 +477,7 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
}
/* find .*ignore files to load ignore patterns from */
for (i = 0; opts.skip_vcs_ignores ? (i == 0) : (ignore_pattern_files[i] != NULL); i++) {
for (i = 0; opts.skip_vcs_ignores ? (i <= 1) : (ignore_pattern_files[i] != NULL); i++) {
ignore_file = ignore_pattern_files[i];
ag_asprintf(&dir_full_path, "%s/%s", path, ignore_file);
load_ignore_patterns(ig, dir_full_path);
@ -558,20 +485,9 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
dir_full_path = NULL;
}
/* path_start is the part of path that isn't in base_path
* base_path will have a trailing '/' because we put it there in parse_options
*/
base_path_len = base_path ? strlen(base_path) : 0;
for (i = 0; ((size_t)i < base_path_len) && (path[i]) && (base_path[i] == path[i]); i++) {
path_start = path + i + 1;
}
log_debug("search_dir: path is '%s', base_path is '%s', path_start is '%s'", path, base_path, path_start);
scandir_baton.ig = ig;
scandir_baton.base_path = base_path;
scandir_baton.base_path_len = base_path_len;
scandir_baton.path_start = path_start;
scandir_baton.base_path_len = base_path ? strlen(base_path) : 0;
results = ag_scandir(path, &dir_list, &filename_filter, &scandir_baton);
if (results == 0) {
log_debug("No results found in directory %s", path);
@ -626,7 +542,7 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
if (!is_directory(path, dir)) {
if (opts.file_search_regex) {
rc = pcre_exec(opts.file_search_regex, NULL, dir_full_path, strlen(dir_full_path),
rc = pcre2_match(opts.file_search_regex, NULL, dir_full_path, strlen(dir_full_path),
0, 0, offset_vector, 3);
if (rc < 0) { /* no match */
log_debug("Skipping %s due to file_search_regex.", dir_full_path);


@ -1,11 +1,13 @@
#ifndef SEARCH_H
#define SEARCH_H
#include "config.h"
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <pcre.h>
#include <pcre2.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@ -17,8 +19,6 @@
#include <sys/stat.h>
#include <unistd.h>
#include "config.h"
#ifdef HAVE_PTHREAD_H
#include <pthread.h>
#endif
@ -31,9 +31,9 @@
#include "uthash.h"
#include "util.h"
extern size_t alpha_skip_lookup[256];
extern size_t *find_skip_lookup;
extern uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
size_t alpha_skip_lookup[256];
size_t *find_skip_lookup;
uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
struct work_queue_t {
char *path;
@ -41,12 +41,12 @@ struct work_queue_t {
};
typedef struct work_queue_t work_queue_t;
extern work_queue_t *work_queue;
extern work_queue_t *work_queue_tail;
extern int done_adding_files;
extern pthread_cond_t files_ready;
extern pthread_mutex_t stats_mtx;
extern pthread_mutex_t work_queue_mtx;
work_queue_t *work_queue;
work_queue_t *work_queue_tail;
int done_adding_files;
pthread_cond_t files_ready;
pthread_mutex_t stats_mtx;
pthread_mutex_t work_queue_mtx;
/* For symlink loop detection */
@ -64,11 +64,11 @@ typedef struct {
UT_hash_handle hh;
} symdir_t;
extern symdir_t *symhash;
symdir_t *symhash;
ssize_t search_buf(const char *buf, const size_t buf_len,
const char *dir_full_path);
ssize_t search_stream(FILE *stream, const char *path);
void search_buf(const char *buf, const size_t buf_len,
const char *dir_full_path);
void search_stream(FILE *stream, const char *path);
void search_file(const char *file_full_path);
void *search_file_worker(void *i);


@ -457,34 +457,24 @@ typedef unsigned char uint8_t;
switch (_hj_k) { \
case 11: \
hashv += ((unsigned)_hj_key[10] << 24); \
/* fall through */ \
case 10: \
hashv += ((unsigned)_hj_key[9] << 16); \
/* fall through */ \
case 9: \
hashv += ((unsigned)_hj_key[8] << 8); \
/* fall through */ \
case 8: \
_hj_j += ((unsigned)_hj_key[7] << 24); \
/* fall through */ \
case 7: \
_hj_j += ((unsigned)_hj_key[6] << 16); \
/* fall through */ \
case 6: \
_hj_j += ((unsigned)_hj_key[5] << 8); \
/* fall through */ \
case 5: \
_hj_j += _hj_key[4]; \
/* fall through */ \
case 4: \
_hj_i += ((unsigned)_hj_key[3] << 24); \
/* fall through */ \
case 3: \
_hj_i += ((unsigned)_hj_key[2] << 16); \
/* fall through */ \
case 2: \
_hj_i += ((unsigned)_hj_key[1] << 8); \
/* fall through */ \
case 1: \
_hj_i += _hj_key[0]; \
} \


@ -1,3 +1,5 @@
#include "config.h"
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>
@ -5,7 +7,6 @@
#include <string.h>
#include <sys/stat.h>
#include "config.h"
#include "util.h"
#ifdef _WIN32
@ -21,8 +22,6 @@
} \
return ptr;
FILE *out_fd = NULL;
ag_stats stats;
void *ag_malloc(size_t size) {
void *ptr = malloc(size);
CHECK_AND_RETURN(ptr)
@ -150,13 +149,6 @@ size_t ag_max(size_t a, size_t b) {
return a;
}
size_t ag_min(size_t a, size_t b) {
if (b < a) {
return b;
}
return a;
}
void generate_hash(const char *find, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
int i;
for (i = f_len - sizeof(uint16_t); i >= 0; i--) {
@ -185,12 +177,12 @@ void generate_hash(const char *find, const size_t f_len, uint8_t *h_table, const
/* Boyer-Moore strstr */
const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len,
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup, const int case_insensitive) {
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup) {
ssize_t i;
size_t pos = f_len - 1;
while (pos < s_len) {
for (i = f_len - 1; i >= 0 && (case_insensitive ? tolower(s[pos]) : s[pos]) == find[i]; pos--, i--) {
for (i = f_len - 1; i >= 0 && s[pos] == find[i]; pos--, i--) {
}
if (i < 0) {
return s + pos + 1;
@ -201,9 +193,25 @@ const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_
return NULL;
}
// Clang's -fsanitize=alignment (included in -fsanitize=undefined) will flag
// the intentional unaligned access here, so suppress it for this function
NO_SANITIZE_ALIGNMENT const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
/* Copy-pasted from above. Yes I know this is bad. One day I might even fix it. */
const char *boyer_moore_strncasestr(const char *s, const char *find, const size_t s_len, const size_t f_len,
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup) {
ssize_t i;
size_t pos = f_len - 1;
while (pos < s_len) {
for (i = f_len - 1; i >= 0 && tolower(s[pos]) == find[i]; pos--, i--) {
}
if (i < 0) {
return s + pos + 1;
}
pos += ag_max(alpha_skip_lookup[(unsigned char)s[pos]], find_skip_lookup[i]);
}
return NULL;
}
const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
if (s_len < f_len)
return NULL;
@ -239,6 +247,17 @@ NO_SANITIZE_ALIGNMENT const char *hash_strnstr(const char *s, const char *find,
return NULL;
}
strncmp_fp get_strstr(enum case_behavior casing) {
strncmp_fp ag_strncmp_fp = &boyer_moore_strnstr;
if (casing == CASE_INSENSITIVE) {
ag_strncmp_fp = &boyer_moore_strncasestr;
}
return ag_strncmp_fp;
}
size_t invert_matches(const char *buf, const size_t buf_len, match_t matches[], size_t matches_len) {
size_t i;
size_t match_read_index = 0;
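
The hunks above show two ways to support case-insensitive literal search: one `boyer_moore_strnstr` that takes a `case_insensitive` flag, or two near-identical functions selected through the `get_strstr` function pointer. The flag approach can be sketched with a simplified Boyer-Moore-Horspool search, which uses only a bad-character skip table. This is an illustration, not ag's implementation: `horspool_strnstr` and `fold` are hypothetical names, and unlike the code in the diff it builds its own skip table and folds both the haystack and the needle.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Lowercase c when case_insensitive is set, otherwise return it as is. */
static unsigned char fold(unsigned char c, int case_insensitive) {
    return case_insensitive ? (unsigned char)tolower(c) : c;
}

/* Hypothetical helper: find the first occurrence of find (f_len bytes)
 * in s (s_len bytes) using the Horspool bad-character rule.
 * Returns a pointer into s, or NULL if there is no match. */
const char *horspool_strnstr(const char *s, size_t s_len,
                             const char *find, size_t f_len,
                             int case_insensitive) {
    size_t skip[UCHAR_MAX + 1];
    size_t i, pos = 0;

    if (f_len == 0) {
        return s;
    }
    if (s_len < f_len) {
        return NULL;
    }
    /* Default shift is the full needle length. */
    for (i = 0; i <= UCHAR_MAX; i++) {
        skip[i] = f_len;
    }
    /* Characters in the needle (except the last) allow shorter shifts. */
    for (i = 0; i + 1 < f_len; i++) {
        skip[fold((unsigned char)find[i], case_insensitive)] = f_len - 1 - i;
    }
    while (pos + f_len <= s_len) {
        size_t j = f_len;
        /* Compare the window right to left. */
        while (j > 0 && fold((unsigned char)s[pos + j - 1], case_insensitive) ==
                            fold((unsigned char)find[j - 1], case_insensitive)) {
            j--;
        }
        if (j == 0) {
            return s + pos;
        }
        /* Shift by the skip value of the window's last character. */
        pos += skip[fold((unsigned char)s[pos + f_len - 1], case_insensitive)];
    }
    return NULL;
}
```

A single function with a flag keeps the comparison logic in one place, at the cost of a branch per comparison. The function-pointer variant avoids that branch but duplicates the loop.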
@ -313,19 +332,23 @@ void realloc_matches(match_t **matches, size_t *matches_size, size_t matches_len
*matches = ag_realloc(*matches, *matches_size * sizeof(match_t));
}
void compile_study(pcre **re, pcre_extra **re_extra, char *q, const int pcre_opts, const int study_opts) {
void compile_study(pcre2_code **re, pcre2_compile_context **re_ctx, char *q, const uint32_t pcre_opts, const int study_opts) {
const char *pcre_err = NULL;
int pcre_err_offset = 0;
*re = pcre_compile(q, pcre_opts, &pcre_err, &pcre_err_offset, NULL);
*re = pcre2_compile(q, pcre_opts, &pcre_err, &pcre_err_offset, NULL, NULL);
if (*re == NULL) {
die("Bad regex! pcre_compile() failed at position %i: %s\nIf you meant to search for a literal string, run ag with -Q",
// TODO: use pcre2_get_error_message()
die("Bad regex! pcre2_compile() failed at position %i: %s\nIf you meant to search for a literal string, run ag with -Q",
pcre_err_offset,
pcre_err);
}
*re_extra = pcre_study(*re, study_opts, &pcre_err);
if (*re_extra == NULL) {
log_debug("pcre_study returned nothing useful. Error: %s", pcre_err);
pcre2_jit_compile(*re, pcre_opts);
*re_ctx = NULL;
*re_ctx = pcre2_match_data_create_from_pattern(*re, NULL);
// *re_ctx = pcre2_init_context(NULL);
if (*re_ctx == NULL) {
log_debug("pcre2_init_context returned nothing useful. Error: %s", pcre_err);
}
}
@ -518,7 +541,7 @@ int is_symlink(const char *path, const struct dirent *d) {
int is_named_pipe(const char *path, const struct dirent *d) {
#ifdef HAVE_DIRENT_DTYPE
if (d->d_type != DT_UNKNOWN && d->d_type != DT_LNK) {
if (d->d_type != DT_UNKNOWN) {
return d->d_type == DT_FIFO || d->d_type == DT_SOCK;
}
#endif
@ -530,11 +553,7 @@ int is_named_pipe(const char *path, const struct dirent *d) {
return FALSE;
}
free(full_path);
return S_ISFIFO(s.st_mode)
#ifdef S_ISSOCK
|| S_ISSOCK(s.st_mode)
#endif
;
return S_ISFIFO(s.st_mode) || S_ISSOCK(s.st_mode);
}
void ag_asprintf(char **ret, const char *fmt, ...) {
@ -626,7 +645,7 @@ ssize_t getline(char **lineptr, size_t *n, FILE *stream) {
ssize_t buf_getline(const char **line, const char *buf, const size_t buf_len, const size_t buf_offset) {
const char *cur = buf + buf_offset;
ssize_t i;
for (i = 0; (buf_offset + i < buf_len) && cur[i] != '\n'; i++) {
for (i = 0; cur[i] != '\n' && (buf_offset + i < buf_len); i++) {
}
*line = cur;
return i;
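
Note on the `buf_getline` hunk above: the two loop conditions differ only in operand order, and that order decides whether `cur[i]` is read before or after the bounds test. The following is a minimal sketch of the bounds-first form, not the project's code. The name `buf_getline_checked` and the added assertions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for buf_getline(): returns the length of the line
 * that starts at buf_offset (excluding the '\n') and stores the start of
 * that line in *line. */
static ssize_t buf_getline_checked(const char **line, const char *buf,
                                   size_t buf_len, size_t buf_offset) {
    assert(line != NULL && buf != NULL && buf_offset <= buf_len);
    const char *cur = buf + buf_offset;
    ssize_t i;
    /* The bounds test comes first. Because && short-circuits, cur[i] is
     * only read while buf_offset + i is still inside the buffer. */
    for (i = 0; (buf_offset + (size_t)i < buf_len) && cur[i] != '\n'; i++) {
    }
    *line = cur;
    return i;
}
```

With the operands swapped, the loop would read one byte past the end of a buffer whose last line has no trailing newline, which matters for mmap()ed files that end exactly on a page boundary.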


@ -2,17 +2,18 @@
#define UTIL_H
#include <dirent.h>
#include <pcre.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include "config.h"
#include <pcre2.h>
#include "log.h"
#include "options.h"
extern FILE *out_fd;
FILE *out_fd;
#ifndef TRUE
#define TRUE 1
@ -24,12 +25,6 @@ extern FILE *out_fd;
#define H_SIZE (64 * 1024)
#ifdef __clang__
#define NO_SANITIZE_ALIGNMENT __attribute__((no_sanitize("alignment")))
#else
#define NO_SANITIZE_ALIGNMENT
#endif
void *ag_malloc(size_t size);
void *ag_realloc(void *ptr, size_t size);
void *ag_calloc(size_t nelem, size_t elsize);
@ -42,16 +37,18 @@ typedef struct {
} match_t;
typedef struct {
size_t total_bytes;
size_t total_files;
size_t total_matches;
size_t total_file_matches;
long total_bytes;
long total_files;
long total_matches;
long total_file_matches;
struct timeval time_start;
struct timeval time_end;
} ag_stats;
extern ag_stats stats;
ag_stats stats;
typedef const char *(*strncmp_fp)(const char *, const char *, const size_t, const size_t, const size_t[], const size_t *);
/* Union to translate between chars and words without violating strict aliasing */
typedef union {
@ -69,15 +66,18 @@ void generate_hash(const char *find, const size_t f_len, uint8_t *H, const int c
/* max is already defined on spec-violating compilers such as MinGW */
size_t ag_max(size_t a, size_t b);
size_t ag_min(size_t a, size_t b);
const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len,
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup, const int case_insensitive);
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup);
const char *boyer_moore_strncasestr(const char *s, const char *find, const size_t s_len, const size_t f_len,
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup);
const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive);
strncmp_fp get_strstr(enum case_behavior opts);
size_t invert_matches(const char *buf, const size_t buf_len, match_t matches[], size_t matches_len);
void realloc_matches(match_t **matches, size_t *matches_size, size_t matches_len);
void compile_study(pcre **re, pcre_extra **re_extra, char *q, const int pcre_opts, const int study_opts);
void compile_study(pcre2_code **re, pcre2_compile_context **re_ctx, char *q, const uint32_t pcre_opts, const int study_opts);
int is_binary(const void *buf, const size_t buf_len);
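
Note on the prototypes above: one side of this diff has a single `boyer_moore_strnstr` with a `case_insensitive` flag, the other has a copy-pasted `boyer_moore_strncasestr` chosen through `get_strstr`. The sketch below shows one way to keep a single search loop and still dispatch through a function pointer. It is an illustration under assumptions, not ag's implementation: every name (`search_fp`, `horspool`, `pick_search`, `enum casing`) is hypothetical, and it uses plain Horspool with one bad-character table built per call rather than ag's two precomputed skip tables.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical function-pointer type; ag's strncmp_fp has a different
 * signature because it also takes the precomputed skip tables. */
typedef const char *(*search_fp)(const char *s, size_t s_len,
                                 const char *find, size_t f_len);

/* Shared Horspool loop. fold_case != 0 lowercases both sides. */
static const char *horspool(const char *s, size_t s_len, const char *find,
                            size_t f_len, int fold_case) {
    size_t skip[256];
    size_t i, pos;

    assert(s != NULL && find != NULL);
    if (f_len == 0 || s_len < f_len) {
        return NULL;
    }
    /* Bad-character table: default shift is the whole pattern length. */
    for (i = 0; i < 256; i++) {
        skip[i] = f_len;
    }
    /* Every pattern byte except the last one gets a shorter shift. */
    for (i = 0; i + 1 < f_len; i++) {
        unsigned char c = (unsigned char)find[i];
        if (fold_case) {
            c = (unsigned char)tolower(c);
        }
        skip[c] = f_len - 1 - i;
    }
    for (pos = 0; pos + f_len <= s_len;) {
        unsigned char last = (unsigned char)s[pos + f_len - 1];
        if (fold_case) {
            last = (unsigned char)tolower(last);
        }
        /* Compare right to left; i == 0 afterwards means a full match. */
        for (i = f_len; i > 0; i--) {
            unsigned char a = (unsigned char)s[pos + i - 1];
            unsigned char b = (unsigned char)find[i - 1];
            if (fold_case) {
                a = (unsigned char)tolower(a);
                b = (unsigned char)tolower(b);
            }
            if (a != b) {
                break;
            }
        }
        if (i == 0) {
            return s + pos;
        }
        pos += skip[last];
    }
    return NULL;
}

static const char *search_case(const char *s, size_t s_len,
                               const char *find, size_t f_len) {
    return horspool(s, s_len, find, f_len, 0);
}

static const char *search_nocase(const char *s, size_t s_len,
                                 const char *find, size_t f_len) {
    return horspool(s, s_len, find, f_len, 1);
}

enum casing { CASING_SENSITIVE, CASING_INSENSITIVE };

/* Counterpart of get_strstr(): pick the wrapper once, call it many times. */
static search_fp pick_search(enum casing c) {
    return c == CASING_INSENSITIVE ? search_nocase : search_case;
}
```

The trade-off is the usual one: the flag costs a branch per compared byte inside the hot loop, while two copies of the loop avoid the branch at the price of duplicated code. Thin wrappers over one `static` function let the compiler specialize each wrapper if it chooses to inline.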


@ -1,403 +0,0 @@
#ifdef __FreeBSD__
#include <sys/endian.h>
#endif
#include <sys/types.h>
#ifdef __CYGWIN__
typedef _off64_t off64_t;
#endif
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "config.h"
#ifdef HAVE_ERR_H
#include <err.h>
#endif
#ifdef HAVE_ZLIB_H
#include <zlib.h>
#endif
#ifdef HAVE_LZMA_H
#include <lzma.h>
#endif
#include "decompress.h"
#if HAVE_FOPENCOOKIE
#define min(a, b) ({ \
__typeof (a) _a = (a); \
__typeof (b) _b = (b); \
_a < _b ? _a : _b; })
static cookie_read_function_t zfile_read;
static cookie_seek_function_t zfile_seek;
static cookie_close_function_t zfile_close;
static const cookie_io_functions_t zfile_io = {
.read = zfile_read,
.write = NULL,
.seek = zfile_seek,
.close = zfile_close,
};
#define KB (1024)
struct zfile {
FILE *in; // Source FILE stream
uint64_t logic_offset, // Logical offset in output (forward seeks)
decode_offset, // Where we've decoded to
actual_len;
uint32_t outbuf_start;
ag_compression_type ctype;
union {
z_stream gz;
lzma_stream lzma;
} stream;
uint8_t inbuf[32 * KB];
uint8_t outbuf[256 * KB];
bool eof;
};
#define CAVAIL_IN(c) ((c)->ctype == AG_GZIP ? (c)->stream.gz.avail_in : (c)->stream.lzma.avail_in)
#define CNEXT_OUT(c) ((c)->ctype == AG_GZIP ? (c)->stream.gz.next_out : (c)->stream.lzma.next_out)
static int
zfile_cookie_init(struct zfile *cookie) {
#ifdef HAVE_LZMA_H
lzma_ret lzrc;
#endif
int rc;
assert(cookie->logic_offset == 0);
assert(cookie->decode_offset == 0);
cookie->actual_len = 0;
switch (cookie->ctype) {
#ifdef HAVE_ZLIB_H
case AG_GZIP:
memset(&cookie->stream.gz, 0, sizeof cookie->stream.gz);
rc = inflateInit2(&cookie->stream.gz, 32 + 15);
if (rc != Z_OK) {
log_err("Unable to initialize zlib: %s", zError(rc));
return EIO;
}
cookie->stream.gz.next_in = NULL;
cookie->stream.gz.avail_in = 0;
cookie->stream.gz.next_out = cookie->outbuf;
cookie->stream.gz.avail_out = sizeof cookie->outbuf;
break;
#endif
#ifdef HAVE_LZMA_H
case AG_XZ:
cookie->stream.lzma = (lzma_stream)LZMA_STREAM_INIT;
lzrc = lzma_auto_decoder(&cookie->stream.lzma, -1, 0);
if (lzrc != LZMA_OK) {
log_err("Unable to initialize lzma_auto_decoder: %d", lzrc);
return EIO;
}
cookie->stream.lzma.next_in = NULL;
cookie->stream.lzma.avail_in = 0;
cookie->stream.lzma.next_out = cookie->outbuf;
cookie->stream.lzma.avail_out = sizeof cookie->outbuf;
break;
#endif
default:
log_err("Unsupported compression type: %d", cookie->ctype);
return EINVAL;
}
cookie->outbuf_start = 0;
cookie->eof = false;
return 0;
}
static void
zfile_cookie_cleanup(struct zfile *cookie) {
switch (cookie->ctype) {
#ifdef HAVE_ZLIB_H
case AG_GZIP:
inflateEnd(&cookie->stream.gz);
break;
#endif
#ifdef HAVE_LZMA_H
case AG_XZ:
lzma_end(&cookie->stream.lzma);
break;
#endif
default:
/* Compiler false positive - unreachable. */
break;
}
}
/*
* Open compressed file 'path' as a (forward-)seekable (and rewindable),
* read-only stream.
*/
FILE *
decompress_open(int fd, const char *mode, ag_compression_type ctype) {
struct zfile *cookie;
FILE *res, *in;
int error;
cookie = NULL;
in = res = NULL;
if (strstr(mode, "w") || strstr(mode, "a")) {
errno = EINVAL;
goto out;
}
in = fdopen(fd, mode);
if (in == NULL)
goto out;
/*
* No validation of compression type is done -- file is assumed to
* match input. In Ag, the compression type is already detected, so
* that's ok.
*/
cookie = malloc(sizeof *cookie);
if (cookie == NULL) {
errno = ENOMEM;
goto out;
}
cookie->in = in;
cookie->logic_offset = 0;
cookie->decode_offset = 0;
cookie->ctype = ctype;
error = zfile_cookie_init(cookie);
if (error != 0) {
errno = error;
goto out;
}
res = fopencookie(cookie, mode, zfile_io);
out:
if (res == NULL) {
if (in != NULL)
fclose(in);
if (cookie != NULL)
free(cookie);
}
return res;
}
/*
* Return number of bytes into buf, 0 on EOF, -1 on error. Update stream
* offset.
*/
static ssize_t
zfile_read(void *cookie_, char *buf, size_t size) {
struct zfile *cookie = cookie_;
size_t nb, ignorebytes;
ssize_t total = 0;
lzma_ret lzret;
int ret;
assert(size <= SSIZE_MAX);
if (size == 0)
return 0;
if (cookie->eof)
return 0;
ret = Z_OK;
lzret = LZMA_OK;
ignorebytes = cookie->logic_offset - cookie->decode_offset;
assert(ignorebytes == 0);
do {
size_t inflated;
/* Drain output buffer first */
while (CNEXT_OUT(cookie) >
&cookie->outbuf[cookie->outbuf_start]) {
size_t left = CNEXT_OUT(cookie) -
&cookie->outbuf[cookie->outbuf_start];
size_t ignoreskip = min(ignorebytes, left);
size_t toread;
if (ignoreskip > 0) {
ignorebytes -= ignoreskip;
left -= ignoreskip;
cookie->outbuf_start += ignoreskip;
cookie->decode_offset += ignoreskip;
}
// Ran out of output before we seek()ed up.
if (ignorebytes > 0)
break;
toread = min(left, size);
memcpy(buf, &cookie->outbuf[cookie->outbuf_start],
toread);
buf += toread;
size -= toread;
left -= toread;
cookie->outbuf_start += toread;
cookie->decode_offset += toread;
cookie->logic_offset += toread;
total += toread;
if (size == 0)
break;
}
if (size == 0)
break;
/*
* If we have not satisfied read, the output buffer must be
* empty.
*/
assert(cookie->stream.gz.next_out ==
&cookie->outbuf[cookie->outbuf_start]);
if ((cookie->ctype == AG_XZ && lzret == LZMA_STREAM_END) ||
(cookie->ctype == AG_GZIP && ret == Z_STREAM_END)) {
cookie->eof = true;
break;
}
/* Read more input if empty */
if (CAVAIL_IN(cookie) == 0) {
nb = fread(cookie->inbuf, 1, sizeof cookie->inbuf,
cookie->in);
if (ferror(cookie->in)) {
warn("error read core");
exit(1);
}
if (nb == 0 && feof(cookie->in)) {
warn("truncated file");
exit(1);
}
if (cookie->ctype == AG_XZ) {
cookie->stream.lzma.avail_in = nb;
cookie->stream.lzma.next_in = cookie->inbuf;
} else {
cookie->stream.gz.avail_in = nb;
cookie->stream.gz.next_in = cookie->inbuf;
}
}
/* Reset stream state to beginning of output buffer */
if (cookie->ctype == AG_XZ) {
cookie->stream.lzma.next_out = cookie->outbuf;
cookie->stream.lzma.avail_out = sizeof cookie->outbuf;
} else {
cookie->stream.gz.next_out = cookie->outbuf;
cookie->stream.gz.avail_out = sizeof cookie->outbuf;
}
cookie->outbuf_start = 0;
if (cookie->ctype == AG_GZIP) {
ret = inflate(&cookie->stream.gz, Z_NO_FLUSH);
if (ret != Z_OK && ret != Z_STREAM_END) {
log_err("Found mem/data error while decompressing zlib stream: %s", zError(ret));
return -1;
}
} else {
lzret = lzma_code(&cookie->stream.lzma, LZMA_RUN);
if (lzret != LZMA_OK && lzret != LZMA_STREAM_END) {
log_err("Found mem/data error while decompressing xz/lzma stream: %d", lzret);
return -1;
}
}
inflated = CNEXT_OUT(cookie) - &cookie->outbuf[0];
cookie->actual_len += inflated;
} while (!ferror(cookie->in) && size > 0);
assert(total <= SSIZE_MAX);
return total;
}
static int
zfile_seek(void *cookie_, off64_t *offset_, int whence) {
struct zfile *cookie = cookie_;
off64_t new_offset = 0, offset = *offset_;
if (whence == SEEK_SET) {
new_offset = offset;
} else if (whence == SEEK_CUR) {
new_offset = (off64_t)cookie->logic_offset + offset;
} else {
/* SEEK_END not ok */
return -1;
}
if (new_offset < 0)
return -1;
/* Backward seeks to anywhere but 0 are not ok */
if (new_offset < (off64_t)cookie->logic_offset && new_offset != 0) {
return -1;
}
if (new_offset == 0) {
/* rewind(3) */
cookie->decode_offset = 0;
cookie->logic_offset = 0;
zfile_cookie_cleanup(cookie);
zfile_cookie_init(cookie);
} else if ((uint64_t)new_offset > cookie->logic_offset) {
/* Emulate forward seek by skipping ... */
char *buf;
const size_t bsz = 32 * 1024;
buf = malloc(bsz);
while ((uint64_t)new_offset > cookie->logic_offset) {
size_t diff = min(bsz,
(uint64_t)new_offset - cookie->logic_offset);
ssize_t err = zfile_read(cookie_, buf, diff);
if (err < 0) {
free(buf);
return -1;
}
/* Seek past EOF gets positioned at EOF */
if (err == 0) {
assert(cookie->eof);
new_offset = cookie->logic_offset;
break;
}
}
free(buf);
}
assert(cookie->logic_offset == (uint64_t)new_offset);
*offset_ = new_offset;
return 0;
}
static int
zfile_close(void *cookie_) {
struct zfile *cookie = cookie_;
zfile_cookie_cleanup(cookie);
fclose(cookie->in);
free(cookie);
return 0;
}
#endif /* HAVE_FOPENCOOKIE */
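
Note on `zfile_seek` in the file removed above: a decompressed stream cannot jump to an arbitrary offset, so the code allows only a rewind to 0 or a forward seek, and emulates the forward seek by reading and discarding bytes. The sketch below shows that pattern on an in-memory buffer instead of a zlib or lzma stream. The names `fwd_stream`, `fwd_read` and `fwd_seek` are hypothetical, and the 64-byte scratch buffer is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream that can only be read front to back. */
struct fwd_stream {
    const char *data;
    size_t len;
    size_t offset;
};

/* Copy up to size bytes into buf; returns the count, 0 at end of data. */
static size_t fwd_read(struct fwd_stream *st, char *buf, size_t size) {
    assert(st != NULL && buf != NULL && st->offset <= st->len);
    size_t left = st->len - st->offset;
    size_t n = size < left ? size : left;
    memcpy(buf, st->data + st->offset, n);
    st->offset += n;
    return n;
}

/* Returns 0 on success, -1 for a backward seek to anything but 0. */
static int fwd_seek(struct fwd_stream *st, size_t new_offset) {
    char scratch[64];

    if (new_offset == 0) {
        st->offset = 0; /* rewind: a real decoder would be re-initialized here */
        return 0;
    }
    if (new_offset < st->offset) {
        return -1;
    }
    /* Forward seek: read and throw away until the target is reached. */
    while (st->offset < new_offset) {
        size_t want = new_offset - st->offset;
        if (want > sizeof scratch) {
            want = sizeof scratch;
        }
        if (fwd_read(st, scratch, want) == 0) {
            break; /* seeking past the end leaves the stream at the end */
        }
    }
    return 0;
}
```

The cost of a forward seek is proportional to the distance skipped, which is acceptable for a search tool that reads each file once from the start.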


@ -15,7 +15,14 @@ Search a big file:
234881024:hello7516192768
268435456:hello
Fail to regex search a big file:
Regex search a big file:
$ $TESTDIR/../../ag --nocolor --workers=1 --parallel 'hello.*' $TESTDIR/big_file.txt
ERR: Skipping */big_file.txt: pcre_exec() can't handle files larger than 2147483647 bytes. (glob)
[1]
33554432:hello1073741824
67108864:hello2147483648
100663296:hello3221225472
134217728:hello4294967296
167772160:hello5368709120
201326592:hello6442450944
234881024:hello7516192768
268435456:hello


@ -12,7 +12,7 @@ Ensure column is correct:
# Test ackmate output. Not quite right, but at least offsets are in the
# ballpark instead of being 9 quintillion
$ ag --ackmate "lah\nb"
$ ag --ackmate "blah\nb"
:blah.txt
1;blah
2;1 5:blah2
2;0 6:blah2


@ -1,9 +0,0 @@
Setup:
$ . $TESTDIR/setup.sh
$ printf "hello world\n" >test.txt
Verify ag runs with an empty environment:
$ env -i $TESTDIR/../ag --noaffinity --nocolor --workers=1 --parallel hello
test.txt:1:hello world


@ -13,10 +13,10 @@ A genuine zero-length match should succeed:
1:foo
Empty files should be listed with --unrestricted --files-with-matches (-ul)
$ ag -lu --stats | sed '$d' | sort # Remove the last line about timing which will differ
2 files contained matches
2 files searched
2 matches
4 bytes searched
$ ag -lu --stats | sed '$d' # Remove the last line about timing which will differ
empty.txt
nonempty.txt
2 matches
2 files contained matches
2 files searched
4 bytes searched


@ -3,10 +3,6 @@ Setup:
$ . $TESTDIR/setup.sh
$ printf 'foo\n' > ./foo.txt
$ printf 'bar\n' > ./bar.txt
$ printf 'foo\nbar\nbaz\n' > ./baz.txt
$ printf 'duck\nanother duck\nyet another duck\n' > ./duck.txt
$ cp duck.txt goose.txt
$ echo "GOOSE!!!" >> ./goose.txt
Files with matches:
@ -16,17 +12,8 @@ Files with matches:
foo.txt
$ ag --files-with-matches foo bar.txt
[1]
$ ag --files-with-matches foo foo.txt bar.txt baz.txt
foo.txt
baz.txt
$ ag --files-with-matches bar foo.txt bar.txt baz.txt
bar.txt
baz.txt
$ ag --files-with-matches foo bar.txt baz.txt
baz.txt
Files without matches:
(Prints names of files in which no line matches query)
$ ag --files-without-matches bar foo.txt
foo.txt
@ -34,30 +21,3 @@ Files without matches:
foo.txt
$ ag --files-without-matches bar bar.txt
[1]
$ ag --files-without-matches foo foo.txt bar.txt baz.txt
bar.txt
$ ag --files-without-matches bar foo.txt bar.txt baz.txt
foo.txt
Files with inverted matches:
(Prints names of files in which some line doesn't match query)
$ ag --files-with-matches --invert-match bar bar.txt
[1]
$ ag --files-with-matches --invert-match foo foo.txt bar.txt baz.txt
bar.txt
baz.txt
$ ag --files-with-matches --invert-match bar foo.txt bar.txt baz.txt
foo.txt
baz.txt
Files without inverted matches:
(Prints names of files in which no line doesn't match query,
i.e. where every line matches query)
$ ag --files-without-matches --invert-match duck duck.txt
duck.txt
$ ag --files-without-matches --invert-match duck goose.txt
[1]
$ ag --files-without-matches --invert-match duck duck.txt goose.txt
duck.txt


@ -6,12 +6,10 @@ Setup:
$ printf 'targetA\n' > something.js
$ printf 'targetB\n' > aFile.test.txt
$ printf 'targetC\n' > aFile.txt
$ printf 'targetG\n' > something.min.js
$ mkdir -p subdir
$ printf 'targetD\n' > subdir/somethingElse.js
$ printf 'targetE\n' > subdir/anotherFile.test.txt
$ printf 'targetF\n' > subdir/anotherFile.txt
$ printf 'targetH\n' > subdir/somethingElse.min.js
Ignore patterns with single extension in root directory:
@ -23,11 +21,6 @@ Ignore patterns with multiple extensions in root directory:
$ ag "targetB"
[1]
*.js ignores *.min.js in root directory:
$ ag "targetG"
[1]
Do not ignore patterns with partial extensions in root directory:
$ ag "targetC"
@ -43,11 +36,6 @@ Ignore patterns with multiple extensions in subdirectory:
$ ag "targetE"
[1]
*.js ignores *.min.js in subdirectory:
$ ag "targetH"
[1]
Do not ignore patterns with partial extensions in subdirectory:
$ ag "targetF"


@ -1,12 +0,0 @@
Setup:
$ . $TESTDIR/setup.sh
$ printf 'blah1\n' > ./printme.txt
$ printf 'blah2\n' > ./dontprintme.c
$ printf '*\n' > ./.ignore
$ printf '!*.txt\n' >> ./.ignore
Ignore .gitignore patterns but not .ignore patterns:
$ ag blah
printme.txt:1:blah1


@ -1,19 +0,0 @@
Setup:
$ . $TESTDIR/setup.sh
$ mkdir -p subdir/ignoredir
$ mkdir ignoredir
$ printf 'match1\n' > subdir/ignoredir/file1.txt
$ printf 'match1\n' > ignoredir/file1.txt
$ printf '/ignoredir\n' > subdir/.ignore
Ignore file in subdir/ignoredir, but not in ignoredir:
$ ag match
ignoredir/file1.txt:1:match1
From subdir, ignore file in subdir/ignoredir:
$ cd subdir
$ ag match
[1]


@ -12,30 +12,18 @@ Language types are output:
--ada
.ada .adb .ads
--asciidoc
.adoc .ad .asc .asciidoc
--apl
.apl
--asm
.asm .s
--asp
.asp .asa .aspx .asax .ashx .ascx .asmx
--aspx
.asp .asa .aspx .asax .ashx .ascx .asmx
--batch
.bat .cmd
--bazel
.bazel
--bitbake
.bb .bbappend .bbclass .inc
--bro
.bro .bif
--cc
.c .h .xs
@ -46,17 +34,11 @@ Language types are output:
.chpl
--clojure
.clj .cljs .cljc .cljx .edn
.clj .cljs .cljc .cljx
--coffee
.coffee .cjsx
--config
.config
--coq
.coq .g .v
--cpp
.cpp .cc .C .cxx .m .hpp .hh .h .H .hxx .tpp
@ -66,9 +48,6 @@ Language types are output:
--csharp
.cs
--cshtml
.cshtml
--css
.css
@ -78,15 +57,6 @@ Language types are output:
--delphi
.pas .int .dfm .nfm .dof .dpk .dpr .dproj .groupproj .bdsgroup .bdsproj
--dlang
.d .di
--dot
.dot .gv
--dts
.dts .dtsi
--ebuild
.ebuild .eclass
@ -96,9 +66,6 @@ Language types are output:
--elixir
.ex .eex .exs
--elm
.elm
--erlang
.erl .hrl
@ -106,7 +73,7 @@ Language types are output:
.factor
--fortran
.f .F .f77 .f90 .F90 .f95 .f03 .for .ftn .fpp .FPP
.f .f77 .f90 .f95 .f03 .for .ftn .fpp
--fsharp
.fs .fsi .fsx
@ -120,23 +87,14 @@ Language types are output:
--go
.go
--gradle
.gradle
--groovy
.groovy .gtmpl .gpp .grunit .gradle
.groovy .gtmpl .gpp .grunit
--haml
.haml
--handlebars
.hbs
--haskell
.hs .hsig .lhs
--haxe
.hx
.hs .lhs
--hh
.h
@ -144,38 +102,23 @@ Language types are output:
--html
.htm .html .shtml .xhtml
--idris
.idr .ipkg .lidr
--ini
.ini
--ipython
.ipynb
--isabelle
.thy
--j
.ijs
--jade
.jade
--java
.java .properties
--jinja2
.j2
--js
.es6 .js .jsx .vue
.js .jsx .vue
--json
.json
--jsp
.jsp .jspx .jhtm .jhtml .jspf .tag .tagf
.jsp .jspx .jhtm .jhtml
--julia
.jl
@ -192,9 +135,6 @@ Language types are output:
--lisp
.lisp .lsp
--log
.log
--lua
.lua
@ -219,21 +159,12 @@ Language types are output:
--mathematica
.m .wl
--md
.markdown .mdown .mdwn .mkdn .mkd .md
--mercury
.m .moo
--naccess
.asa .rsa
--nim
.nim
--nix
.nix
--objc
.m .h
@ -246,15 +177,9 @@ Language types are output:
--octave
.m
--org
.org
--parrot
.pir .pasm .pmc .ops .pod .pg .tg
--pdb
.pdb
--perl
.pl .pm .pm6 .pod .t
@ -264,24 +189,12 @@ Language types are output:
--pike
.pike .pmod
--plist
.plist
--plone
.pt .cpt .metadata .cpy .py .xml .zcml
--powershell
.ps1
--proto
.proto
--ps1
.ps1
--pug
.pug
--puppet
.pp
@ -297,9 +210,6 @@ Language types are output:
--rake
.Rakefile
--razor
.cshtml
--restructuredtext
.rst
@ -307,7 +217,7 @@ Language types are output:
.rs
--r
.r .R .Rmd .Rnw .Rtex .Rrst
.R .Rmd .Rnw .Rtex .Rrst
--rdoc
.rdoc
@ -342,9 +252,6 @@ Language types are output:
--sql
.sql .ctl
--stata
.do .ado
--stylus
.styl
@ -354,18 +261,9 @@ Language types are output:
--tcl
.tcl .itcl .itk
--terraform
.tf .tfvars
--tex
.tex .cls .sty
--thrift
.thrift
--tla
.tla
--tt
.tt .tt2 .ttml
@ -375,9 +273,6 @@ Language types are output:
--ts
.ts .tsx
--twig
.twig
--vala
.vala .vapi
@ -388,7 +283,7 @@ Language types are output:
.vm .vtl .vsl
--verilog
.v .vh .sv .svh
.v .vh .sv
--vhdl
.vhd .vhdl
@ -396,9 +291,6 @@ Language types are output:
--vim
.vim
--vue
.vue
--wix
.wxi .wxs
@ -409,14 +301,8 @@ Language types are output:
.wadl
--xml
.xml .dtd .xsl .xslt .xsd .ent .tld .plist .wsdl
.xml .dtd .xsl .xslt .ent .tld
--yaml
.yaml .yml
--zeek
.zeek .bro .bif
--zephir
.zep


@ -1,16 +0,0 @@
Setup:
$ . $TESTDIR/setup.sh
$ printf 'foo\n' > ./foo.txt
$ printf 'bar\n' > ./bar.txt
$ printf 'baz\n' > ./baz.txt
All files:
$ ag --print-all-files --group foo | sort
1:foo
bar.txt
baz.txt
foo.txt


@ -1,5 +1,5 @@
%define _bashcompdir %_sysconfdir/bash_completion.d
%define _zshcompdir %{_datadir}/zsh/site-functions
Name: the_silver_searcher
Version: @VERSION@
@ -12,8 +12,8 @@ URL: https://github.com/ggreer/%{name}
Source0: https://github.com/downloads/ggreer/%{name}/%{name}-%{version}.tar.gz
BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XXXXXX)
BuildRequires: pcre-devel, xz-devel, zlib-devel
Requires: pcre, xz, zlib
BuildRequires: pcre2-devel, xz-devel, zlib-devel
Requires: pcre2, xz, zlib
%description
The Silver Searcher
@ -29,7 +29,7 @@ How is it so fast?
* Searching for literals (no regex) uses Boyer-Moore-Horspool strstr.
* Files are mmap()ed instead of read into a buffer.
* If you're building with PCRE 8.21 or greater, regex searches use the JIT compiler.
* Ag calls pcre_study() before executing the regex on a jillion files.
* Ag calls pcre2_study() before executing the regex on a jillion files.
* Instead of calling fnmatch() on every pattern in your ignore files, non-regex patterns are loaded into an array and binary searched.
* Ag uses Pthreads to take advantage of multiple CPU cores and search files in parallel.
@ -62,7 +62,7 @@ rm -rf ${RPM_BUILD_ROOT}
%{_mandir}/*
%config %{_bashcompdir}/ag.bashcomp.sh
%config %{_datadir}/%{name}/completions/ag.bashcomp.sh
%config %{_datadir}/zsh/site-functions/_the_silver_searcher
%changelog
* Thu Dec 5 2013 Emily Strickland <code@emily.st> - 0.18.1-1