Compare commits
1 commit: f90bf93036
41 changed files with 298 additions and 1400 deletions
1
.gitignore
vendored
@@ -1,5 +1,4 @@
*.dSYM
*.gcda
*.o
*.plist
.deps
15
.travis.yml
@@ -1,14 +1,9 @@
language: c
dist: xenial
sudo: false

branches:
only:
- master
- ppc64le
arch:
- amd64
- ppc64le

compiler:
- clang
@@ -27,12 +22,12 @@ addons:

env:
global:
- LLVM_VERSION=6.0.1
- LLVM_VERSION=3.8.0
- LLVM_PATH=$HOME/clang+llvm
- CLANG_FORMAT=$LLVM_PATH/bin/clang-format

before_install:
- wget http://llvm.org/releases/$LLVM_VERSION/clang+llvm-$LLVM_VERSION-x86_64-linux-gnu-ubuntu-16.04.tar.xz -O $LLVM_PATH.tar.xz
- wget http://llvm.org/releases/$LLVM_VERSION/clang+llvm-$LLVM_VERSION-x86_64-linux-gnu-ubuntu-14.04.tar.xz -O $LLVM_PATH.tar.xz
- mkdir $LLVM_PATH
- tar xf $LLVM_PATH.tar.xz -C $LLVM_PATH --strip-components=1
- export PATH=$HOME/.local/bin:$PATH
@@ -42,9 +37,3 @@ install:

script:
- ./build.sh && make test

notifications:
irc: 'chat.freenode.net#ag'
on_success: change
on_failure: always
use_notice: true

@@ -14,13 +14,3 @@ The test suite uses [Cram](https://bitheap.org/cram/). You'll need to build ag
first, and then you can run the suite from the root of the repository :

make test

### Adding filetypes

Ag can search files which belong to a certain class for example `ag --html test`
searches all files with the extension defined in [lang.c](src/lang.c).

If you want to add a new file 'class' to ag please modify [lang.c](src/lang.c) and [list_file_types.t](tests/list_file_types.t).

`lang.c` adds the functionality and `list_file_types.t` adds the test case.
Without adding a test case the test __will__ fail.
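The "Adding filetypes" text in this hunk says a file class maps a name such as `html` to a set of extensions in `lang.c`. A minimal sketch of such a table and its lookup, assuming a struct layout that matches the initializers visible in the `src/lang.c` hunk of this comparison; `MAX_EXTENSIONS`, `example_langs`, `find_lang` and `lang_has_extension` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed upper bound on extensions per language; not taken from the diff. */
#define MAX_EXTENSIONS 12

/* Assumed layout: a class name plus a NULL-padded list of extensions. */
typedef struct {
    const char *name;
    const char *extensions[MAX_EXTENSIONS]; /* unused slots stay NULL */
} lang_spec_t;

/* Three entries copied from the lang.c hunk shown in this comparison. */
static const lang_spec_t example_langs[] = {
    { "clojure", { "clj", "cljs", "cljc", "cljx" } },
    { "haskell", { "hs", "lhs" } },
    { "html", { "htm", "html", "shtml", "xhtml" } },
};

/* Return the table entry for a class name, or NULL if it is unknown. */
static const lang_spec_t *find_lang(const char *name) {
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof(example_langs) / sizeof(example_langs[0]); i++) {
        if (strcmp(example_langs[i].name, name) == 0) {
            return &example_langs[i];
        }
    }
    return NULL;
}

/* Return 1 if ext (without the dot) belongs to the class, 0 otherwise. */
static int lang_has_extension(const lang_spec_t *lang, const char *ext) {
    size_t i;
    assert(lang != NULL && ext != NULL);
    for (i = 0; i < MAX_EXTENSIONS && lang->extensions[i] != NULL; i++) {
        if (strcmp(lang->extensions[i], ext) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a table shaped like this, adding a class is a one-line change to the array, which is presumably why the removed text asks for a matching update to the `list_file_types.t` test.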

@@ -1,8 +1,8 @@
ACLOCAL_AMFLAGS = ${ACLOCAL_FLAGS}

bin_PROGRAMS = ag
ag_SOURCES = src/ignore.c src/ignore.h src/log.c src/log.h src/options.c src/options.h src/print.c src/print_w32.c src/print.h src/scandir.c src/scandir.h src/search.c src/search.h src/lang.c src/lang.h src/util.c src/util.h src/decompress.c src/decompress.h src/uthash.h src/main.c src/zfile.c
ag_LDADD = ${PCRE_LIBS} ${LZMA_LIBS} ${ZLIB_LIBS} $(PTHREAD_LIBS)
ag_SOURCES = src/ignore.c src/ignore.h src/log.c src/log.h src/options.c src/options.h src/print.c src/print_w32.c src/print.h src/scandir.c src/scandir.h src/search.c src/search.h src/lang.c src/lang.h src/util.c src/util.h src/decompress.c src/decompress.h src/uthash.h src/main.c
ag_LDADD = ${PCRE2_LIBS} ${LZMA_LIBS} ${ZLIB_LIBS} $(PTHREAD_LIBS)

dist_man_MANS = doc/ag.1
@@ -13,9 +13,6 @@ dist_zshcomp_DATA = _the_silver_searcher

EXTRA_DIST = Makefile.w32 LICENSE NOTICE the_silver_searcher.spec README.md

all:
@$(MAKE) ag -r

test: ag
cram -v tests/*.t
if HAS_CLANG_FORMAT
@@ -30,4 +27,4 @@ test_big: ag
test_fail: ag
cram -v tests/fail/*.t

.PHONY : all clean test test_big test_fail
.PHONY : all test clean
77
README.md
@@ -6,7 +6,7 @@ A code searching tool similar to `ack`, with a focus on speed.

[](https://floobits.com/ggreer/ag/redirect)

[](https://webchat.freenode.net/?channels=ag)
[](https://webchat.freenode.net/?channels=ag)

Do you know C? Want to improve ag? [I invite you to pair with me](http://geoff.greer.fm/2014/10/13/help-me-get-to-ag-10/).
@@ -34,7 +34,7 @@ There are also [graphs of performance across releases](http://geoff.greer.fm/ag/
* Files are `mmap()`ed instead of read into a buffer.
* Literal string searching uses [Boyer-Moore strstr](https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm).
* Regex searching uses [PCRE's JIT compiler](http://sljit.sourceforge.net/pcre.html) (if Ag is built with PCRE >=8.21).
* Ag calls `pcre_study()` before executing the same regex on every file.
* Ag calls `pcre2_study()` before executing the same regex on every file.
* Instead of calling `fnmatch()` on every pattern in your ignore files, non-regex patterns are loaded into arrays and binary searched.

I've written several blog posts showing how I've improved performance. These include how I [added pthreads](http://geoff.greer.fm/2012/09/07/the-silver-searcher-adding-pthreads/), [wrote my own `scandir()`](http://geoff.greer.fm/2012/09/03/profiling-ag-writing-my-own-scandir/), [benchmarked every revision to find performance regressions](http://geoff.greer.fm/2012/08/25/the-silver-searcher-benchmarking-revisions/), and profiled with [gprof](http://geoff.greer.fm/2012/02/08/profiling-with-gprof/) and [Valgrind](http://geoff.greer.fm/2012/01/23/making-programs-faster-profiling/).
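The unchanged README bullet in this hunk says non-regex ignore patterns are loaded into arrays and binary searched rather than passed to `fnmatch()` one by one. A minimal sketch of that idea using the standard `qsort()` and `bsearch()`; the helper names `sort_names` and `name_is_ignored` are hypothetical and are not taken from Ag's source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Comparator shared by qsort() and bsearch() over an array of C strings. */
static int cmp_names(const void *a, const void *b) {
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Sort the literal (non-glob) ignore names once, after they are loaded. */
static void sort_names(const char **names, size_t len) {
    if (len == 0) {
        return;
    }
    assert(names != NULL);
    qsort(names, len, sizeof(*names), cmp_names);
}

/* Return 1 if filename is present in the sorted array, 0 otherwise.
 * Each lookup costs O(log n) string comparisons instead of one
 * fnmatch() call per pattern. */
static int name_is_ignored(const char **names, size_t len, const char *filename) {
    assert(filename != NULL);
    if (len == 0) {
        return 0;
    }
    return bsearch(&filename, names, len, sizeof(*names), cmp_names) != NULL;
}
```

The trade-off is that the array must be kept sorted as patterns are added; patterns containing glob characters still need `fnmatch()`.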
@@ -42,7 +42,7 @@ I've written several blog posts showing how I've improved performance. These inc

## Installing

### macOS
### OS X

brew install the_silver_searcher
@@ -67,7 +67,7 @@ or
yum install epel-release.noarch the_silver_searcher
* Gentoo

emerge -a sys-apps/the_silver_searcher
emerge the_silver_searcher
* Arch

pacman -S the_silver_searcher
@@ -76,20 +76,6 @@ or

sbopkg -i the_silver_searcher

* openSUSE

zypper install the_silver_searcher

* CentOS

yum install the_silver_searcher

* NixOS/Nix/Nixpkgs

nix-env -iA silver-searcher

* SUSE Linux Enterprise: Follow [these simple instructions](https://software.opensuse.org/download.html?project=utilities&package=the_silver_searcher).


### BSD
@@ -100,67 +86,37 @@ or

pkg_add the_silver_searcher

### Windows
### Cygwin

* Win32/64

Unofficial daily builds are [available](https://github.com/k-takata/the_silver_searcher-win32).

* winget

winget install "The Silver Searcher"

Notes:
- This installs a [release](https://github.com/JFLarvoire/the_silver_searcher/releases) of ag.exe optimized for Windows.
- winget is intended to become the default package manager client for Windows.
As of June 2020, it's still in beta, and can be installed using instructions [there](https://github.com/microsoft/winget-cli).
- The setup script in the Ag's winget package installs ag.exe in the first directory that matches one of these criteria:
1. Over a previous instance of ag.exe *from the same [origin](https://github.com/JFLarvoire/the_silver_searcher)* found in the PATH
2. In the directory defined in environment variable bindir_%PROCESSOR_ARCHITECTURE%
3. In the directory defined in environment variable bindir
4. In the directory defined in environment variable windir

* Chocolatey

choco install ag
* MSYS2

pacman -S mingw-w64-{i686,x86_64}-ag
* Cygwin

Run the relevant [`setup-*.exe`](https://cygwin.com/install.html), and select "the\_silver\_searcher" in the "Utils" category.
Run the relevant [`setup-*.exe`](https://cygwin.com/install.html), and select "the\_silver\_searcher" in the "Utils" category.

## Building from source

### Building master

1. Install dependencies (Automake, pkg-config, PCRE, LZMA):
* macOS:
1. Install dependencies (Automake, pkg-config, PCRE2, LZMA):
* OS X:

brew install automake pkg-config pcre xz
brew install automake pkg-config pcre2 xz
or

port install automake pkgconfig pcre xz
port install automake pkgconfig pcre2 xz
* Ubuntu/Debian:

apt-get install -y automake pkg-config libpcre3-dev zlib1g-dev liblzma-dev
apt-get install -y automake pkg-config libpcre2-dev zlib1g-dev liblzma-dev
* Fedora:

yum -y install pkgconfig automake gcc zlib-devel pcre-devel xz-devel
yum -y install pkgconfig automake gcc zlib-devel pcre2-devel xz-devel
* CentOS:

yum -y groupinstall "Development Tools"
yum -y install pcre-devel xz-devel zlib-devel
* openSUSE:

zypper source-install --build-deps-only the_silver_searcher

yum -y install pcre2-devel xz-devel
* Windows: It's complicated. See [this wiki page](https://github.com/ggreer/the_silver_searcher/wiki/Windows).
2. Run the build script (which just runs aclocal, automake, etc):

./build.sh

On Windows (inside an msys/MinGW shell):
On Windows (inside an msys/MinGW shell):

make -f Makefile.w32
3. Make install:
@@ -185,7 +141,7 @@ You may need to use `sudo` or run as root for the make install.

### Vim

You can use Ag with [ack.vim](https://github.com/mileszs/ack.vim) by adding the following line to your `.vimrc`:
You can use Ag with [ack.vim][] by adding the following line to your `.vimrc`:

let g:ackprg = 'ag --nogroup --nocolor --column'
@@ -208,10 +164,9 @@ TextMate users can use Ag with [my fork](https://github.com/ggreer/AckMate) of t

## Other stuff you might like

* [Ack](https://github.com/petdance/ack3) - Better than grep. Without Ack, Ag would not exist.
* [Ack](https://github.com/petdance/ack2) - Better than grep. Without Ack, Ag would not exist.
* [ack.vim](https://github.com/mileszs/ack.vim)
* [Exuberant Ctags](http://ctags.sourceforge.net/) - Faster than Ag, but it builds an index beforehand. Good for *really* big codebases.
* [Git-grep](http://git-scm.com/docs/git-grep) - As fast as Ag but only works on git repos.
* [fzf](https://github.com/junegunn/fzf) - A command-line fuzzy finder
* [ripgrep](https://github.com/BurntSushi/ripgrep)
* [Sack](https://github.com/sampson-chen/sack) - A utility that wraps Ack and Ag. It removes a lot of repetition from searching and opening matching files.

@@ -67,7 +67,7 @@ _ag() {
--parallel
--passthrough
--passthru
--path-to-ignore
--path-to-agignore
--print-long-lines
--print0
--recurse
@@ -106,7 +106,7 @@ _ag() {
--ignore-dir) # directory completion
_filedir -d
return 0;;
--path-to-ignore) # file completion
--path-to-agignore) # file completion
_filedir
return 0;;
--pager) # command completion

21
autogen.sh
@@ -1,21 +0,0 @@
#!/bin/sh

set -e
cd "$(dirname "$0")"

AC_SEARCH_OPTS=""
# For those of us with pkg-config and other tools in /usr/local
PATH=$PATH:/usr/local/bin

# This is to make life easier for people who installed pkg-config in /usr/local
# but have autoconf/make/etc in /usr/. AKA most mac users
if [ -d "/usr/local/share/aclocal" ]
then
AC_SEARCH_OPTS="-I /usr/local/share/aclocal"
fi

# shellcheck disable=2086
aclocal $AC_SEARCH_OPTS
autoconf
autoheader
automake --add-missing
22
build.sh
@@ -1,8 +1,22 @@
#!/bin/sh

set -e
cd "$(dirname "$0")"
cd "$(dirname "$0")" || exit 1

./autogen.sh
./configure "$@"
AC_SEARCH_OPTS=""
# For those of us with pkg-config and other tools in /usr/local
PATH=$PATH:/usr/local/bin

# This is to make life easier for people who installed pkg-config in /usr/local
# but have autoconf/make/etc in /usr/. AKA most mac users
if [ -d "/usr/local/share/aclocal" ]
then
AC_SEARCH_OPTS="-I /usr/local/share/aclocal"
fi

# shellcheck disable=2086
aclocal $AC_SEARCH_OPTS && \
autoconf && \
autoheader && \
automake --add-missing && \
./configure "$@" && \
make -j4
18
configure.ac
@@ -1,6 +1,6 @@
AC_INIT(
[the_silver_searcher],
[2.2.0],
[1.0.1],
[https://github.com/ggreer/the_silver_searcher/issues],
[the_silver_searcher],
[https://github.com/ggreer/the_silver_searcher])
@@ -10,13 +10,13 @@ AM_INIT_AUTOMAKE([no-define foreign subdir-objects])
AC_PROG_CC
AM_PROG_CC_C_O
AC_PREREQ([2.59])
AC_PROG_GREP

m4_ifdef(
[AM_SILENT_RULES],
[AM_SILENT_RULES([yes])])

PKG_CHECK_MODULES([PCRE], [libpcre])
PKG_CHECK_MODULES([PCRE2], [libpcre2-8])
AC_DEFINE([PCRE2_CODE_UNIT_WIDTH], [8], [Use utf8])

m4_include([m4/ax_pthread.m4])
AX_PTHREAD(
@@ -25,12 +25,7 @@ AX_PTHREAD(
)

# Run CFLAGS="-pg" ./configure if you want debug symbols
if ! echo "$CFLAGS" | "$GREP" '\(^\|[[[:space:]]]\)-O' > /dev/null; then
CFLAGS="$CFLAGS -O2"
fi

CFLAGS="$CFLAGS $PTHREAD_CFLAGS $PCRE_CFLAGS -Wall -Wextra -Wformat=2 -Wno-format-nonliteral -Wshadow"
CFLAGS="$CFLAGS -Wpointer-arith -Wcast-qual -Wmissing-prototypes -Wno-missing-braces -std=gnu89 -D_GNU_SOURCE"
CFLAGS="$CFLAGS $PTHREAD_CFLAGS $PCRE2_CFLAGS -Wall -Wextra -Wformat=2 -Wno-format-nonliteral -Wshadow -Wpointer-arith -Wcast-qual -Wmissing-prototypes -Wno-missing-braces -std=gnu89 -D_GNU_SOURCE -O2"
LDFLAGS="$LDFLAGS"

case $host in
@@ -56,15 +51,14 @@ AS_IF([test "x$enable_lzma" != "xno"], [
PKG_CHECK_MODULES([LZMA], [liblzma])
])

AC_CHECK_DECL([PCRE_CONFIG_JIT], [AC_DEFINE([USE_PCRE_JIT], [], [Use PCRE JIT])], [], [#include <pcre.h>])
AC_CHECK_DECL([PCRE2_CONFIG_JIT], [AC_DEFINE([USE_PCRE2_JIT], [], [Use PCRE2 JIT])], [], [#include <pcre2.h>])

AC_CHECK_DECL([CPU_ZERO, CPU_SET], [AC_DEFINE([USE_CPU_SET], [], [Use CPU_SET macros])] , [], [#include <sched.h>])
AC_CHECK_HEADERS([sys/cpuset.h err.h])

AC_CHECK_MEMBER([struct dirent.d_type], [AC_DEFINE([HAVE_DIRENT_DTYPE], [], [Have dirent struct member d_type])], [], [[#include <dirent.h>]])
AC_CHECK_MEMBER([struct dirent.d_namlen], [AC_DEFINE([HAVE_DIRENT_DNAMLEN], [], [Have dirent struct member d_namlen])], [], [[#include <dirent.h>]])

AC_CHECK_FUNCS(fgetln fopencookie getline realpath strlcpy strndup vasprintf madvise posix_fadvise pthread_setaffinity_np pledge)
AC_CHECK_FUNCS(fgetln getline realpath strlcpy strndup vasprintf madvise posix_fadvise pthread_setaffinity_np pledge)

AC_CONFIG_FILES([Makefile the_silver_searcher.spec])
AC_CONFIG_HEADERS([src/config.h])
4
doc/ag.1
@@ -1,7 +1,7 @@
.\" generated with Ronn/v0.7.3
.\" http://github.com/rtomayko/ronn/tree/0.7.3
.
.TH "AG" "1" "December 2016" "" ""
.TH "AG" "1" "November 2016" "" ""
.
.SH "NAME"
\fBag\fR \- The Silver Searcher\. Like ack, but faster\.
@@ -136,7 +136,7 @@ Skip the rest of a file after NUM matches\. Default is 0, which never skips\.
.
.TP
\fB\-\-[no]mmap\fR
Toggle use of memory\-mapped I/O\. Defaults to true on platforms where \fBmmap()\fR is faster than \fBread()\fR\. (All but macOS\.)
Toggle use of memory\-mapped I/O\. Defaults to true\.
.
.TP
\fB\-\-[no]multiline\fR

@@ -109,8 +109,7 @@ Recursively search for PATTERN in PATH. Like grep or ack, but faster.
Skip the rest of a file after NUM matches. Default is 0, which never skips.

* `--[no]mmap`:
Toggle use of memory-mapped I/O. Defaults to true on platforms where
`mmap()` is faster than `read()`. (All but macOS.)
Toggle use of memory-mapped I/O. Defaults to true.

* `--[no]multiline`:
Match regexes across newlines. Enabled by default.
@@ -207,9 +206,6 @@ Recursively search for PATTERN in PATH. Like grep or ack, but faster.
* `--workers NUM`:
Use NUM worker threads. Default is the number of CPU cores, with a max of 8.

* `-W --width NUM`:
Truncate match lines after NUM characters.

* `-z --search-zip`:
Search contents of compressed files. Currently, gz and xz are supported.
This option requires that ag is built with lzma and zlib.
10
pgo.sh
@@ -1,10 +0,0 @@
#!/bin/sh

set -e
cd "$(dirname "$0")"

make clean
./build.sh CFLAGS="$CFLAGS -fprofile-generate"
./ag example ..
make clean
./build.sh CFLAGS="$CFLAGS -fprofile-correction -fprofile-use"
196
sanitize.sh
@@ -1,196 +0,0 @@
#!/bin/bash
# Copyright 2016 Allen Wild
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

AVAILABLE_SANITIZERS=(
address
thread
undefined
valgrind
)

DEFAULT_SANITIZERS=(
address
thread
undefined
)

usage() {
cat <<EOF
Usage: $0 [-h] [valgrind | [SANITIZERS ...]]

This script recompiles ag using -fsanitize=<SANITIZER> and then runs the test suite.
Memory leaks or other errors will be printed in ag's output, thus failing the test.

Available LLVM sanitizers are: ${AVAILABLE_SANITIZERS[*]}

The compile-time sanitizers are supported in clang/llvm >= 3.1 and gcc >= 4.8
for x86_64 Linux only. clang is preferred and will be used, if available.

For function names and line numbers in error output traces, llvm-symbolizer needs
to be available in PATH or set through ASAN_SYMBOLIZER_PATH.

If 'valgrind' is passed as the sanitizer, then ag will be run through valgrind
without recompiling. If $(dirname $0)/ag doesn't exist, then it will be built.

WARNING: This script will run "make distclean" and "./configure" to recompile ag
once per sanitizer (except for valgrind). If you need to pass additional
options to ./configure, put them in the CONFIGOPTS environment variable.
EOF
}

vrun() {
echo "Running: $*"
"$@"
}

die() {
echo "Fatal: $*"
exit 1
}

valid_sanitizer() {
for san in "${AVAILABLE_SANITIZERS[@]}"; do
if [[ "$1" == "$san" ]]; then
return 0
fi
done
return 1
}

run_sanitizer() {
sanitizer=$1
if [[ "$sanitizer" == "valgrind" ]]; then
run_valgrind
return $?
fi

echo -e "\nCompiling for sanitizer '$sanitizer'"
[[ -f Makefile ]] && vrun make distclean
vrun ./configure $CONFIGOPTS CC=$SANITIZE_CC \
CFLAGS="-g -O0 -fsanitize=$sanitizer $EXTRA_CFLAGS"
if [[ $? != 0 ]]; then
echo "ERROR: Failed to configure. Try setting CONFIGOPTS?"
return 1
fi

vrun make
if [[ $? != 0 ]]; then
echo "ERROR: failed to build"
return 1
fi

echo "Testing with sanitizer '$sanitizer'"
vrun make test
if [[ $? != 0 ]]; then
echo "Tests for sanitizer '$sanitizer' FAIL!"
echo "Check the above output for failure information"
return 2
else
echo "Tests for sanitizer '$sanitizer' PASS!"
return 0
fi
}

run_valgrind() {
echo "Compiling ag normally for use with valgrind"
[[ -f Makefile ]] && vrun make distclean
vrun ./configure $CONFIGOPTS
if [[ $? != 0 ]]; then
echo "ERROR: Failed to configure. Try setting CONFIGOPTS?"
return 1
fi

vrun make
if [[ $? != 0 ]]; then
echo "ERROR: failed to build"
return 1
fi

echo "Running: AGPROG=\"valgrind -q $PWD/ag\" make test"
AGPROG="valgrind -q $PWD/ag" make test
if [[ $? != 0 ]]; then
echo "Valgrind tests FAIL!"
return 1
else
echo "Valgrind tests PASS!"
return 0
fi
}

#### MAIN ####
run_sanitizers=()
for opt in "$@"; do
if [[ "$opt" == -* ]]; then
case opt in
-h|--help)
usage
exit 0
;;
*)
echo "Unknown option: '$opt'"
usage
exit 1
;;
esac
else
if valid_sanitizer "$opt"; then
run_sanitizers+=("$opt")
else
echo "Invalid Sanitizer: '$opt'"
usage
exit 1
fi
fi
done

if [[ ${#run_sanitizers[@]} == 0 ]]; then
run_sanitizers=(${DEFAULT_SANITIZERS[@]})
fi

if [[ -n $CC ]]; then
echo "Using CC=$CC"
SANITIZE_CC="$CC"
elif which clang &>/dev/null; then
SANITIZE_CC="clang"
else
echo "Warning: CC unset and clang not found"
fi

if [[ -n $CFLAGS ]]; then
EXTRA_CFLAGS="$CFLAGS"
unset CFLAGS
fi

if [[ ! -e ./configure ]]; then
echo "Warning: ./configure not found. Running autogen"
vrun ./autogen.sh || die "autogen.sh failed"
fi

echo "Running sanitizers: ${run_sanitizers[*]}"
failedsan=()
for san in "${run_sanitizers[@]}"; do
run_sanitizer $san
if [[ $? != 0 ]]; then
failedsan+=($san)
fi
done

if [[ ${#failedsan[@]} == 0 ]]; then
echo "All sanitizers PASSED"
exit 0
else
echo "The following sanitizers FAILED: ${failedsan[*]}"
exit ${#failedsan[@]}
fi

@@ -1,8 +1,6 @@
#ifndef DECOMPRESS_H
#define DECOMPRESS_H

#include <stdio.h>

#include "config.h"
#include "log.h"
#include "options.h"
@@ -18,9 +16,4 @@ typedef enum {
ag_compression_type is_zipped(const void *buf, const int buf_len);

void *decompress(const ag_compression_type zip_type, const void *buf, const int buf_len, const char *dir_full_path, int *new_buf_len);

#if HAVE_FOPENCOOKIE
FILE *decompress_open(int fd, const char *mode, ag_compression_type ctype);
#endif

#endif
44
src/ignore.c
@@ -20,8 +20,6 @@
const int fnmatch_flags = FNM_PATHNAME;
#endif

ignores *root_ignores;

/* TODO: build a huge-ass list of files we want to ignore by default (build cache stuff, pyc files, etc) */

const char *evil_hardcoded_ignore_files[] = {
@@ -32,6 +30,8 @@ const char *evil_hardcoded_ignore_files[] = {

/* Warning: changing the first two strings will break skip_vcs_ignores. */
const char *ignore_pattern_files[] = {
/* Warning: .agignore will one day be removed in favor of .ignore */
".agignore",
".ignore",
".gitignore",
".git/info/exclude",
@@ -53,8 +53,6 @@ ignores *init_ignore(ignores *parent, const char *dirname, const size_t dirname_
ig->slash_names_len = 0;
ig->regexes = NULL;
ig->regexes_len = 0;
ig->invert_regexes = NULL;
ig->invert_regexes_len = 0;
ig->slash_regexes = NULL;
ig->slash_regexes_len = 0;
ig->dirname = dirname;
@@ -88,7 +86,6 @@ void cleanup_ignore(ignores *ig) {
free_strings(ig->names, ig->names_len);
free_strings(ig->slash_names, ig->slash_names_len);
free_strings(ig->regexes, ig->regexes_len);
free_strings(ig->invert_regexes, ig->invert_regexes_len);
free_strings(ig->slash_regexes, ig->slash_regexes_len);
if (ig->abs_path) {
free(ig->abs_path);
@@ -120,21 +117,15 @@ void add_ignore_pattern(ignores *ig, const char *pattern) {
char ***patterns_p;
size_t *patterns_len;
if (is_fnmatch(pattern)) {
if (pattern[0] == '*' && pattern[1] == '.' && strchr(pattern + 2, '.') && !is_fnmatch(pattern + 2)) {
if (pattern[0] == '*' && pattern[1] == '.' && !(is_fnmatch(pattern + 2))) {
patterns_p = &(ig->extensions);
patterns_len = &(ig->extensions_len);
pattern += 2;
pattern_len -= 2;
} else if (pattern[0] == '/') {
patterns_p = &(ig->slash_regexes);
patterns_len = &(ig->slash_regexes_len);
pattern++;
pattern_len--;
} else if (pattern[0] == '!') {
patterns_p = &(ig->invert_regexes);
patterns_len = &(ig->invert_regexes_len);
pattern++;
pattern_len--;
} else {
patterns_p = &(ig->regexes);
patterns_len = &(ig->regexes_len);
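The `add_ignore_pattern` hunk above routes each glob pattern into one of several arrays depending on its first characters. A minimal sketch of that dispatch as a pure function, assuming the shorter of the two extension tests shown in the hunk and covering every branch that appears in it; `pattern_kind`, `has_glob_chars` and `classify_pattern` are hypothetical names, and the character set in `has_glob_chars` is an assumption about what `is_fnmatch()` checks:

```c
#include <assert.h>
#include <string.h>

typedef enum {
    PAT_LITERAL,     /* no glob characters: can be binary searched */
    PAT_EXTENSION,   /* "*.ext" with a literal ext */
    PAT_SLASH_GLOB,  /* glob anchored with a leading '/' */
    PAT_INVERT_GLOB, /* glob negated with a leading '!' */
    PAT_GLOB         /* any other glob: needs fnmatch() */
} pattern_kind;

/* Stand-in for is_fnmatch(); the exact character set is an assumption. */
static int has_glob_chars(const char *s) {
    return strpbrk(s, "!*?[]") != NULL;
}

/* Mirror the if/else chain from the hunk without touching any arrays. */
static pattern_kind classify_pattern(const char *pattern) {
    assert(pattern != NULL);
    if (!has_glob_chars(pattern)) {
        return PAT_LITERAL;
    }
    /* pattern[1] is only read when pattern[0] matched, so "*" is safe. */
    if (pattern[0] == '*' && pattern[1] == '.' && !has_glob_chars(pattern + 2)) {
        return PAT_EXTENSION;
    }
    if (pattern[0] == '/') {
        return PAT_SLASH_GLOB;
    }
    if (pattern[0] == '!') {
        return PAT_INVERT_GLOB;
    }
    return PAT_GLOB;
}
```

Separating the classification from the array bookkeeping like this makes each branch testable on its own; the real function also strips the matched prefix before storing the pattern.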
@@ -202,13 +193,12 @@ static int ackmate_dir_match(const char *dir_name) {
return 0;
}
/* we just care about the match, not where the matches are */
return pcre_exec(opts.ackmate_dir_filter, NULL, dir_name, strlen(dir_name), 0, 0, NULL, 0);
return pcre2_match(opts.ackmate_dir_filter, dir_name, strlen(dir_name), 0, 0, NULL, NULL);
}

/* This is the hottest code in Ag. 10-15% of all execution time is spent here */
static int path_ignore_search(const ignores *ig, const char *path, const char *filename) {
char *temp;
int temp_start_pos;
size_t i;
int match_pos;

@@ -219,12 +209,9 @@ static int path_ignore_search(const ignores *ig, const char *path, const char *f
}

ag_asprintf(&temp, "%s/%s", path[0] == '.' ? path + 1 : path, filename);
//ig->abs_path has its leading slash stripped, so we have to strip the leading slash
//of temp as well
temp_start_pos = (temp[0] == '/') ? 1 : 0;

if (strncmp(temp + temp_start_pos, ig->abs_path, ig->abs_path_len) == 0) {
char *slash_filename = temp + temp_start_pos + ig->abs_path_len;
if (strncmp(temp, ig->abs_path, ig->abs_path_len) == 0) {
char *slash_filename = temp + ig->abs_path_len;
if (slash_filename[0] == '/') {
slash_filename++;
}
@@ -265,15 +252,6 @@ static int path_ignore_search(const ignores *ig, const char *path, const char *f
}
}

for (i = 0; i < ig->invert_regexes_len; i++) {
if (fnmatch(ig->invert_regexes[i], filename, fnmatch_flags) == 0) {
log_debug("file %s not ignored because name matches regex pattern !%s", filename, ig->invert_regexes[i]);
free(temp);
return 0;
}
log_debug("pattern !%s doesn't match file %s", ig->invert_regexes[i], filename);
}

for (i = 0; i < ig->regexes_len; i++) {
if (fnmatch(ig->regexes[i], filename, fnmatch_flags) == 0) {
log_debug("file %s ignored because name matches regex pattern %s", filename, ig->regexes[i]);
@@ -317,7 +295,15 @@ int filename_filter(const char *path, const struct dirent *dir, void *baton) {
}

scandir_baton_t *scandir_baton = (scandir_baton_t *)baton;
const char *path_start = scandir_baton->path_start;
const char *base_path = scandir_baton->base_path;
const size_t base_path_len = scandir_baton->base_path_len;
const char *path_start = path;

for (i = 0; base_path[i] == path[i] && i < base_path_len; i++) {
/* base_path always ends with "/\0" while path doesn't, so this is safe */
path_start = path + i + 2;
}
log_debug("path_start %s filename %s", path_start, filename);

const char *extension = strchr(filename, '.');
if (extension) {

@@ -15,8 +15,6 @@ struct ignores {

char **regexes; /* For patterns that need fnmatch */
size_t regexes_len;
char **invert_regexes; /* For "!" patterns */
size_t invert_regexes_len;
char **slash_regexes;
size_t slash_regexes_len;

@@ -29,7 +27,7 @@ struct ignores {
};
typedef struct ignores ignores;

extern ignores *root_ignores;
ignores *root_ignores;

extern const char *evil_hardcoded_ignore_files[];
extern const char *ignore_pattern_files[];
60
src/lang.c
@@ -7,67 +7,47 @@
lang_spec_t langs[] = {
{ "actionscript", { "as", "mxml" } },
{ "ada", { "ada", "adb", "ads" } },
{ "asciidoc", { "adoc", "ad", "asc", "asciidoc" } },
{ "apl", { "apl" } },
{ "asm", { "asm", "s" } },
{ "asp", { "asp", "asa", "aspx", "asax", "ashx", "ascx", "asmx" } },
{ "aspx", { "asp", "asa", "aspx", "asax", "ashx", "ascx", "asmx" } },
{ "batch", { "bat", "cmd" } },
{ "bazel", { "bazel" } },
{ "bitbake", { "bb", "bbappend", "bbclass", "inc" } },
{ "bro", { "bro", "bif" } },
{ "cc", { "c", "h", "xs" } },
{ "cfmx", { "cfc", "cfm", "cfml" } },
{ "chpl", { "chpl" } },
{ "clojure", { "clj", "cljs", "cljc", "cljx", "edn" } },
{ "clojure", { "clj", "cljs", "cljc", "cljx" } },
{ "coffee", { "coffee", "cjsx" } },
{ "config", { "config" } },
{ "coq", { "coq", "g", "v" } },
{ "cpp", { "cpp", "cc", "C", "cxx", "m", "hpp", "hh", "h", "H", "hxx", "tpp" } },
{ "crystal", { "cr", "ecr" } },
{ "csharp", { "cs" } },
{ "cshtml", { "cshtml" } },
{ "css", { "css" } },
{ "cython", { "pyx", "pxd", "pxi" } },
{ "delphi", { "pas", "int", "dfm", "nfm", "dof", "dpk", "dpr", "dproj", "groupproj", "bdsgroup", "bdsproj" } },
{ "dlang", { "d", "di" } },
{ "dot", { "dot", "gv" } },
{ "dts", { "dts", "dtsi" } },
{ "ebuild", { "ebuild", "eclass" } },
{ "elisp", { "el" } },
{ "elixir", { "ex", "eex", "exs" } },
{ "elm", { "elm" } },
{ "erlang", { "erl", "hrl" } },
{ "factor", { "factor" } },
{ "fortran", { "f", "F", "f77", "f90", "F90", "f95", "f03", "for", "ftn", "fpp", "FPP" } },
{ "fortran", { "f", "f77", "f90", "f95", "f03", "for", "ftn", "fpp" } },
{ "fsharp", { "fs", "fsi", "fsx" } },
{ "gettext", { "po", "pot", "mo" } },
{ "glsl", { "vert", "tesc", "tese", "geom", "frag", "comp" } },
{ "go", { "go" } },
{ "gradle", { "gradle" } },
{ "groovy", { "groovy", "gtmpl", "gpp", "grunit", "gradle" } },
{ "groovy", { "groovy", "gtmpl", "gpp", "grunit" } },
{ "haml", { "haml" } },
{ "handlebars", { "hbs" } },
{ "haskell", { "hs", "hsig", "lhs" } },
{ "haxe", { "hx" } },
{ "haskell", { "hs", "lhs" } },
{ "hh", { "h" } },
{ "html", { "htm", "html", "shtml", "xhtml" } },
{ "idris", { "idr", "ipkg", "lidr" } },
{ "ini", { "ini" } },
{ "ipython", { "ipynb" } },
{ "isabelle", { "thy" } },
{ "j", { "ijs" } },
{ "jade", { "jade" } },
{ "java", { "java", "properties" } },
{ "jinja2", { "j2" } },
{ "js", { "es6", "js", "jsx", "vue" } },
{ "js", { "js", "jsx", "vue" } },
{ "json", { "json" } },
{ "jsp", { "jsp", "jspx", "jhtm", "jhtml", "jspf", "tag", "tagf" } },
{ "jsp", { "jsp", "jspx", "jhtm", "jhtml" } },
{ "julia", { "jl" } },
{ "kotlin", { "kt" } },
{ "less", { "less" } },
{ "liquid", { "liquid" } },
{ "lisp", { "lisp", "lsp" } },
{ "log", { "log" } },
{ "lua", { "lua" } },
{ "m4", { "m4" } },
{ "make", { "Makefiles", "mk", "mak" } },
@ -76,36 +56,26 @@ lang_spec_t langs[] = {
|
|||
{ "mason", { "mas", "mhtml", "mpl", "mtxt" } },
|
||||
{ "matlab", { "m" } },
|
||||
{ "mathematica", { "m", "wl" } },
|
||||
{ "md", { "markdown", "mdown", "mdwn", "mkdn", "mkd", "md" } },
|
||||
{ "mercury", { "m", "moo" } },
|
||||
{ "naccess", { "asa", "rsa" } },
|
||||
{ "nim", { "nim" } },
|
||||
{ "nix", { "nix" } },
|
||||
{ "objc", { "m", "h" } },
|
||||
{ "objcpp", { "mm", "h" } },
|
||||
{ "ocaml", { "ml", "mli", "mll", "mly" } },
|
||||
{ "octave", { "m" } },
|
||||
{ "org", { "org" } },
|
||||
{ "parrot", { "pir", "pasm", "pmc", "ops", "pod", "pg", "tg" } },
|
||||
{ "pdb", { "pdb" } },
|
||||
{ "perl", { "pl", "pm", "pm6", "pod", "t" } },
|
||||
{ "php", { "php", "phpt", "php3", "php4", "php5", "phtml" } },
|
||||
{ "pike", { "pike", "pmod" } },
|
||||
{ "plist", { "plist" } },
|
||||
{ "plone", { "pt", "cpt", "metadata", "cpy", "py", "xml", "zcml" } },
|
||||
{ "powershell", { "ps1" } },
|
||||
{ "proto", { "proto" } },
|
||||
{ "ps1", { "ps1" } },
|
||||
{ "pug", { "pug" } },
|
||||
{ "puppet", { "pp" } },
|
||||
{ "python", { "py" } },
|
||||
{ "qml", { "qml" } },
|
||||
{ "racket", { "rkt", "ss", "scm" } },
|
||||
{ "rake", { "Rakefile" } },
|
||||
{ "razor", { "cshtml" } },
|
||||
{ "restructuredtext", { "rst" } },
|
||||
{ "rs", { "rs" } },
|
||||
{ "r", { "r", "R", "Rmd", "Rnw", "Rtex", "Rrst" } },
|
||||
{ "r", { "R", "Rmd", "Rnw", "Rtex", "Rrst" } },
|
||||
{ "rdoc", { "rdoc" } },
|
||||
{ "ruby", { "rb", "rhtml", "rjs", "rxml", "erb", "rake", "spec" } },
|
||||
{ "rust", { "rs" } },
|
||||
|
|
@ -117,32 +87,24 @@ lang_spec_t langs[] = {
|
|||
{ "smalltalk", { "st" } },
|
||||
{ "sml", { "sml", "fun", "mlb", "sig" } },
|
||||
{ "sql", { "sql", "ctl" } },
|
||||
{ "stata", { "do", "ado" } },
|
||||
{ "stylus", { "styl" } },
|
||||
{ "swift", { "swift" } },
|
||||
{ "tcl", { "tcl", "itcl", "itk" } },
|
||||
{ "terraform", { "tf", "tfvars" } },
|
||||
{ "tex", { "tex", "cls", "sty" } },
|
||||
{ "thrift", { "thrift" } },
|
||||
{ "tla", { "tla" } },
|
||||
{ "tt", { "tt", "tt2", "ttml" } },
|
||||
{ "toml", { "toml" } },
|
||||
{ "ts", { "ts", "tsx" } },
|
||||
{ "twig", { "twig" } },
|
||||
{ "vala", { "vala", "vapi" } },
|
||||
{ "vb", { "bas", "cls", "frm", "ctl", "vb", "resx" } },
|
||||
{ "velocity", { "vm", "vtl", "vsl" } },
|
||||
{ "verilog", { "v", "vh", "sv", "svh" } },
|
||||
{ "verilog", { "v", "vh", "sv" } },
|
||||
{ "vhdl", { "vhd", "vhdl" } },
|
||||
{ "vim", { "vim" } },
|
||||
{ "vue", { "vue" } },
|
||||
{ "wix", { "wxi", "wxs" } },
|
||||
{ "wsdl", { "wsdl" } },
|
||||
{ "wadl", { "wadl" } },
|
||||
{ "xml", { "xml", "dtd", "xsl", "xslt", "xsd", "ent", "tld", "plist", "wsdl" } },
|
||||
{ "yaml", { "yaml", "yml" } },
|
||||
{ "zeek", { "zeek", "bro", "bif" } },
|
||||
{ "zephir", { "zep" } }
|
||||
{ "xml", { "xml", "dtd", "xsl", "xslt", "ent", "tld" } },
|
||||
{ "yaml", { "yaml", "yml" } }
|
||||
};
|
||||
|
||||
size_t get_lang_count() {
|
||||
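Reviewer note: the `langs[]` table in this hunk pairs a language name with a list of file extensions. A minimal sketch of how such a table can be queried is shown here. The `demo_` names, the three sample rows, and the `MAX_EXTENSIONS` capacity are assumptions made for illustration; they are not the definitions in `lang.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXTENSIONS 12 /* assumed capacity, not taken from the diff */

/* Hypothetical stand-in for lang_spec_t: a name plus a NULL-padded extension list. */
typedef struct {
    const char *name;
    const char *extensions[MAX_EXTENSIONS];
} demo_lang_spec_t;

/* Three rows copied from the hunk; unused slots are zero-initialized to NULL. */
static const demo_lang_spec_t demo_langs[] = {
    { "batch", { "bat", "cmd" } },
    { "go", { "go" } },
    { "yaml", { "yaml", "yml" } }
};

/* Return 1 if `ext` is listed for language `name`, 0 otherwise. */
int demo_lang_has_extension(const char *name, const char *ext) {
    size_t n = sizeof(demo_langs) / sizeof(demo_langs[0]);
    size_t i, j;

    assert(name != NULL && ext != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(demo_langs[i].name, name) != 0) {
            continue;
        }
        for (j = 0; j < MAX_EXTENSIONS && demo_langs[i].extensions[j] != NULL; j++) {
            if (strcmp(demo_langs[i].extensions[j], ext) == 0) {
                return 1;
            }
        }
        return 0; /* language found, extension not listed */
    }
    return 0; /* unknown language */
}
```

With these sample rows, `demo_lang_has_extension("yaml", "yml")` reports a hit, while an unknown language or an unlisted extension reports a miss.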
|
|
|
|||
|
|
@ -4,7 +4,6 @@
|
|||
#include "log.h"
|
||||
#include "util.h"
|
||||
|
||||
pthread_mutex_t print_mtx = PTHREAD_MUTEX_INITIALIZER;
|
||||
static enum log_level log_threshold = LOG_LEVEL_ERR;
|
||||
|
||||
void set_log_level(enum log_level threshold) {
|
||||
|
|
|
|||
|
|
@ -9,7 +9,7 @@
|
|||
#include <pthread.h>
|
||||
#endif
|
||||
|
||||
extern pthread_mutex_t print_mtx;
|
||||
pthread_mutex_t print_mtx;
|
||||
|
||||
enum log_level {
|
||||
LOG_LEVEL_DEBUG = 10,
|
||||
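Reviewer note: the two hunks above show `print_mtx` both as an `extern` declaration in the header paired with an initialized definition in `log.c`, and as a bare `pthread_mutex_t print_mtx;` in the header. The bare form is a tentative definition in every translation unit that includes the header, which only links when the toolchain merges common symbols (GCC defaults to `-fno-common` since version 10). A minimal single-file sketch of the declaration/definition split, using a hypothetical `demo_print_mtx`:

```c
#include <assert.h>
#include <pthread.h>

/* What a header would carry: a declaration only, no storage is reserved. */
extern pthread_mutex_t demo_print_mtx;

/* What exactly one .c file would carry: the definition and its initializer. */
pthread_mutex_t demo_print_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Lock and unlock once; returns 0 when both calls succeed. */
int demo_lock_roundtrip(void) {
    int rv = pthread_mutex_lock(&demo_print_mtx);
    if (rv != 0) {
        return rv;
    }
    return pthread_mutex_unlock(&demo_print_mtx);
}
```

A caller could check the round trip with `assert(demo_lock_roundtrip() == 0);`. The same reasoning applies to the `cli_options opts` declaration in `options.h`.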
|
|
|
|||
49
src/main.c
|
|
@ -1,5 +1,7 @@
|
|||
#include "config.h"
|
||||
|
||||
#include <ctype.h>
|
||||
#include <pcre.h>
|
||||
#include <pcre2.h>
|
||||
#include <stdarg.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
|
|
@ -9,20 +11,10 @@
|
|||
#include <windows.h>
|
||||
#endif
|
||||
|
||||
#include "config.h"
|
||||
|
||||
#ifdef HAVE_SYS_CPUSET_H
|
||||
#include <sys/cpuset.h>
|
||||
#endif
|
||||
|
||||
#ifdef HAVE_PTHREAD_H
|
||||
#include <pthread.h>
|
||||
#endif
|
||||
|
||||
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && defined(__FreeBSD__)
|
||||
#include <pthread_np.h>
|
||||
#endif
|
||||
|
||||
#include "log.h"
|
||||
#include "options.h"
|
||||
#include "search.h"
|
||||
|
|
@ -37,7 +29,7 @@ int main(int argc, char **argv) {
|
|||
char **base_paths = NULL;
|
||||
char **paths = NULL;
|
||||
int i;
|
||||
int pcre_opts = PCRE_MULTILINE;
|
||||
int pcre_opts = PCRE2_MULTILINE;
|
||||
int study_opts = 0;
|
||||
worker_t *workers = NULL;
|
||||
int workers_len;
|
||||
|
|
@ -57,19 +49,22 @@ int main(int argc, char **argv) {
|
|||
out_fd = stdout;
|
||||
|
||||
parse_options(argc, argv, &base_paths, &paths);
|
||||
log_debug("PCRE Version: %s", pcre_version());
|
||||
log_debug("PCRE Version: %s", pcre2_version());
|
||||
if (opts.stats) {
|
||||
memset(&stats, 0, sizeof(stats));
|
||||
gettimeofday(&(stats.time_start), NULL);
|
||||
}
|
||||
|
||||
#ifdef USE_PCRE_JIT
|
||||
int has_jit = 0;
|
||||
pcre_config(PCRE_CONFIG_JIT, &has_jit);
|
||||
if (has_jit) {
|
||||
study_opts |= PCRE_STUDY_JIT_COMPILE;
|
||||
}
|
||||
#endif
|
||||
/*
|
||||
TODO: call pcre2_jit_compile in compile_study
|
||||
// #ifdef USE_PCRE2_JIT
|
||||
// int has_jit = 0;
|
||||
// pcre2_config(PCRE2_CONFIG_JIT, &has_jit);
|
||||
// if (has_jit) {
|
||||
// study_opts |= PCRE2_STUDY_JIT_COMPILE;
|
||||
// }
|
||||
// #endif
|
||||
*/
|
||||
|
||||
#ifdef _WIN32
|
||||
{
|
||||
|
|
@ -131,7 +126,7 @@ int main(int argc, char **argv) {
|
|||
}
|
||||
} else {
|
||||
if (opts.casing == CASE_INSENSITIVE) {
|
||||
pcre_opts |= PCRE_CASELESS;
|
||||
pcre_opts |= PCRE2_CASELESS;
|
||||
}
|
||||
if (opts.word_regexp) {
|
||||
char *word_regexp_query;
|
||||
|
|
@ -140,7 +135,7 @@ int main(int argc, char **argv) {
|
|||
opts.query = word_regexp_query;
|
||||
opts.query_len = strlen(opts.query);
|
||||
}
|
||||
compile_study(&opts.re, &opts.re_extra, opts.query, pcre_opts, study_opts);
|
||||
compile_study(&opts.re, &opts.re_ctx, opts.query, pcre_opts, study_opts);
|
||||
}
|
||||
|
||||
if (opts.search_stream) {
|
||||
|
|
@ -152,13 +147,9 @@ int main(int argc, char **argv) {
|
|||
if (rv != 0) {
|
||||
die("Error in pthread_create(): %s", strerror(rv));
|
||||
}
|
||||
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && (defined(USE_CPU_SET) || defined(HAVE_SYS_CPUSET_H))
|
||||
#if defined(HAVE_PTHREAD_SETAFFINITY_NP) && defined(USE_CPU_SET)
|
||||
if (opts.use_thread_affinity) {
|
||||
#if defined(__linux__) || defined(__midipix__)
|
||||
cpu_set_t cpu_set;
|
||||
#elif __FreeBSD__
|
||||
cpuset_t cpu_set;
|
||||
#endif
|
||||
CPU_ZERO(&cpu_set);
|
||||
CPU_SET(i % num_cores, &cpu_set);
|
||||
rv = pthread_setaffinity_np(workers[i].thread, sizeof(cpu_set), &cpu_set);
|
||||
|
|
@ -185,7 +176,7 @@ int main(int argc, char **argv) {
|
|||
log_debug("searching path %s for %s", paths[i], opts.query);
|
||||
symhash = NULL;
|
||||
ignores *ig = init_ignore(root_ignores, "", 0);
|
||||
struct stat s = { .st_dev = 0 };
|
||||
struct stat s = {.st_dev = 0 };
|
||||
#ifndef _WIN32
|
||||
/* The device is ignored if opts.one_dev is false, so it's fine
|
||||
* to leave it at the default 0
|
||||
|
|
@ -213,7 +204,7 @@ int main(int argc, char **argv) {
|
|||
double time_diff = ((long)stats.time_end.tv_sec * 1000000 + stats.time_end.tv_usec) -
|
||||
((long)stats.time_start.tv_sec * 1000000 + stats.time_start.tv_usec);
|
||||
time_diff /= 1000000;
|
||||
printf("%zu matches\n%zu files contained matches\n%zu files searched\n%zu bytes searched\n%f seconds\n",
|
||||
printf("%ld matches\n%ld files contained matches\n%ld files searched\n%ld bytes searched\n%f seconds\n",
|
||||
stats.total_matches, stats.total_file_matches, stats.total_files, stats.total_bytes, time_diff);
|
||||
pthread_mutex_destroy(&stats_mtx);
|
||||
}
|
||||
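Reviewer note: the stats hunk switches between `%zu` and `%ld` for the counters. The field types are not visible in this diff; if they are `size_t`, `%zu` is the matching conversion, whereas `%ld` expects a `long`. The sketch below assumes `size_t` counters and uses hypothetical `demo_` helpers; the elapsed-time arithmetic mirrors the lines in the hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/* Elapsed seconds between two timevals, computed in microseconds first. */
double demo_elapsed_seconds(struct timeval start, struct timeval end) {
    double diff = ((long)end.tv_sec * 1000000 + end.tv_usec) -
                  ((long)start.tv_sec * 1000000 + start.tv_usec);
    return diff / 1000000;
}

/* Format two size_t counters with %zu; returns snprintf's result. */
int demo_format_stats(char *out, size_t out_len, size_t matches, size_t files) {
    return snprintf(out, out_len, "%zu matches\n%zu files searched\n", matches, files);
}
```

For example, a start of 1.5 s and an end of 3.0 s gives an elapsed value of 1.5, and counters 3 and 2 format as two lines of text.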
|
|
|
|||
|
|
@ -1,3 +1,5 @@
|
|||
#include "config.h"
|
||||
|
||||
#include <errno.h>
|
||||
#include <limits.h>
|
||||
#include <stdarg.h>
|
||||
|
|
@ -8,7 +10,6 @@
|
|||
#include <sys/stat.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include "config.h"
|
||||
#include "ignore.h"
|
||||
#include "lang.h"
|
||||
#include "log.h"
|
||||
|
|
@ -20,8 +21,6 @@ const char *color_line_number = "\033[1;33m"; /* bold yellow */
|
|||
const char *color_match = "\033[30;43m"; /* black with yellow background */
|
||||
const char *color_path = "\033[1;32m"; /* bold green */
|
||||
|
||||
cli_options opts;
|
||||
|
||||
/* TODO: try to obey out_fd? */
|
||||
void usage(void) {
|
||||
printf("\n");
|
||||
|
|
@ -59,14 +58,11 @@ Output Options:\n\
|
|||
(Enabled by default)\n\
|
||||
-C --context [LINES] Print lines before and after matches (Default: 2)\n\
|
||||
--[no]group Same as --[no]break --[no]heading\n\
|
||||
-g --filename-pattern PATTERN\n\
|
||||
Print filenames matching PATTERN\n\
|
||||
-g PATTERN Print filenames matching PATTERN\n\
|
||||
-l --files-with-matches Only print filenames that contain matches\n\
|
||||
(don't print the matching lines)\n\
|
||||
-L --files-without-matches\n\
|
||||
Only print filenames that don't contain matches\n\
|
||||
--print-all-files Print headings for all files searched, even those that\n\
|
||||
don't contain matches\n\
|
||||
--[no]numbers Print line numbers. Default is to omit line numbers\n\
|
||||
when searching streams\n\
|
||||
-o --only-matching Prints only the matching part of the lines\n\
|
||||
|
|
@ -129,7 +125,7 @@ void print_version(void) {
|
|||
char lzma = '-';
|
||||
char zlib = '-';
|
||||
|
||||
#ifdef USE_PCRE_JIT
|
||||
#ifdef USE_PCRE2_JIT
|
||||
jit = '+';
|
||||
#endif
|
||||
#ifdef HAVE_LZMA_H
|
||||
|
|
@ -145,29 +141,18 @@ void print_version(void) {
|
|||
}
|
||||
|
||||
void init_options(void) {
|
||||
char *term = getenv("TERM");
|
||||
|
||||
memset(&opts, 0, sizeof(opts));
|
||||
opts.casing = CASE_DEFAULT;
|
||||
opts.color = TRUE;
|
||||
if (term && !strcmp(term, "dumb")) {
|
||||
opts.color = FALSE;
|
||||
}
|
||||
opts.color_win_ansi = FALSE;
|
||||
opts.max_matches_per_file = 0;
|
||||
opts.max_search_depth = DEFAULT_MAX_SEARCH_DEPTH;
|
||||
#if defined(__APPLE__) || defined(__MACH__)
|
||||
/* mmap() is slower than normal read() on macOS. Default to off. */
|
||||
opts.mmap = FALSE;
|
||||
#else
|
||||
opts.mmap = TRUE;
|
||||
#endif
|
||||
opts.multiline = TRUE;
|
||||
opts.width = 0;
|
||||
opts.path_sep = '\n';
|
||||
opts.print_break = TRUE;
|
||||
opts.print_path = PATH_PRINT_DEFAULT;
|
||||
opts.print_all_paths = FALSE;
|
||||
opts.print_line_numbers = TRUE;
|
||||
opts.recurse_dirs = TRUE;
|
||||
opts.color_path = ag_strdup(color_path);
|
||||
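Reviewer note: one side of the `init_options` hunk turns color off when `TERM` is `dumb`. A minimal sketch of that decision as a pure function follows; `demo_default_color` is a hypothetical name, and the real code writes to `opts.color` instead of returning a value.

```c
#include <assert.h>
#include <string.h>

/* Return 0 (no color) for TERM=dumb, 1 otherwise, including when TERM is unset. */
int demo_default_color(const char *term) {
    if (term && strcmp(term, "dumb") == 0) {
        return 0;
    }
    return 1;
}
```

In a program this would be called as `demo_default_color(getenv("TERM"))`, with `<stdlib.h>` included for `getenv`.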
|
|
@ -185,24 +170,23 @@ void cleanup_options(void) {
|
|||
free(opts.query);
|
||||
}
|
||||
|
||||
pcre_free(opts.re);
|
||||
if (opts.re_extra) {
|
||||
/* Using pcre_free_study on pcre_extra* can segfault on some versions of PCRE */
|
||||
pcre_free(opts.re_extra);
|
||||
pcre2_code_free(opts.re);
|
||||
if (opts.re_ctx) {
|
||||
pcre2_compile_context_free(opts.re_ctx);
|
||||
}
|
||||
|
||||
if (opts.ackmate_dir_filter) {
|
||||
pcre_free(opts.ackmate_dir_filter);
|
||||
pcre2_code_free(opts.ackmate_dir_filter);
|
||||
}
|
||||
if (opts.ackmate_dir_filter_extra) {
|
||||
pcre_free(opts.ackmate_dir_filter_extra);
|
||||
if (opts.ackmate_dir_filter_ctx) {
|
||||
pcre2_compile_context_free(opts.ackmate_dir_filter_ctx);
|
||||
}
|
||||
|
||||
if (opts.file_search_regex) {
|
||||
pcre_free(opts.file_search_regex);
|
||||
pcre2_code_free(opts.file_search_regex);
|
||||
}
|
||||
if (opts.file_search_regex_extra) {
|
||||
pcre_free(opts.file_search_regex_extra);
|
||||
if (opts.file_search_regex_ctx) {
|
||||
pcre2_compile_context_free(opts.file_search_regex_ctx);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -210,7 +194,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
int ch;
|
||||
size_t i;
|
||||
int path_len = 0;
|
||||
int base_path_len = 0;
|
||||
int useless = 0;
|
||||
int group = 1;
|
||||
int help = 0;
|
||||
|
|
@ -258,7 +241,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
{ "debug", no_argument, NULL, 'D' },
|
||||
{ "depth", required_argument, NULL, 0 },
|
||||
{ "filename", no_argument, NULL, 0 },
|
||||
{ "filename-pattern", required_argument, NULL, 'g' },
|
||||
{ "file-search-regex", required_argument, NULL, 'G' },
|
||||
{ "files-with-matches", no_argument, NULL, 'l' },
|
||||
{ "files-without-matches", no_argument, NULL, 'L' },
|
||||
|
|
@ -315,7 +297,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
{ "passthru", no_argument, &opts.passthrough, 1 },
|
||||
{ "path-to-ignore", required_argument, NULL, 'p' },
|
||||
{ "print0", no_argument, NULL, '0' },
|
||||
{ "print-all-files", no_argument, NULL, 0 },
|
||||
{ "print-long-lines", no_argument, &opts.print_long_lines, 1 },
|
||||
{ "recurse", no_argument, NULL, 'r' },
|
||||
{ "search-binary", no_argument, &opts.search_binary_files, 1 },
|
||||
|
|
@ -434,7 +415,7 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
case 'g':
|
||||
needs_query = accepts_query = 0;
|
||||
opts.match_files = 1;
|
||||
/* fall through */
|
||||
/* Fall through so regex is built */
|
||||
case 'G':
|
||||
if (file_search_regex) {
|
||||
log_err("File search regex (-g or -G) already specified.");
|
||||
|
|
@ -453,9 +434,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
opts.casing = CASE_INSENSITIVE;
|
||||
break;
|
||||
case 'L':
|
||||
opts.print_nonmatching_files = 1;
|
||||
opts.print_path = PATH_PRINT_TOP;
|
||||
break;
|
||||
opts.invert_match = 1;
|
||||
/* fall through */
|
||||
case 'l':
|
||||
needs_query = 0;
|
||||
opts.print_filename_only = 1;
|
||||
|
|
@ -524,7 +504,7 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
break;
|
||||
case 0: /* Long option */
|
||||
if (strcmp(longopts[opt_index].name, "ackmate-dir-filter") == 0) {
|
||||
compile_study(&opts.ackmate_dir_filter, &opts.ackmate_dir_filter_extra, optarg, 0, 0);
|
||||
compile_study(&opts.ackmate_dir_filter, &opts.ackmate_dir_filter_ctx, optarg, 0, 0);
|
||||
break;
|
||||
} else if (strcmp(longopts[opt_index].name, "depth") == 0) {
|
||||
opts.max_search_depth = atoi(optarg);
|
||||
|
|
@ -552,9 +532,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
} else if (strcmp(longopts[opt_index].name, "pager") == 0) {
|
||||
opts.pager = optarg;
|
||||
break;
|
||||
} else if (strcmp(longopts[opt_index].name, "print-all-files") == 0) {
|
||||
opts.print_all_paths = TRUE;
|
||||
break;
|
||||
} else if (strcmp(longopts[opt_index].name, "workers") == 0) {
|
||||
opts.workers = atoi(optarg);
|
||||
break;
|
||||
|
|
@ -597,7 +574,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
}
|
||||
|
||||
log_err("option %s does not take a value", longopts[opt_index].name);
|
||||
/* fall through */
|
||||
default:
|
||||
usage();
|
||||
exit(1);
|
||||
|
|
@ -611,21 +587,21 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
if (file_search_regex) {
|
||||
int pcre_opts = 0;
|
||||
if (opts.casing == CASE_INSENSITIVE || (opts.casing == CASE_SMART && is_lowercase(file_search_regex))) {
|
||||
pcre_opts |= PCRE_CASELESS;
|
||||
pcre_opts |= PCRE2_CASELESS;
|
||||
}
|
||||
if (opts.word_regexp) {
|
||||
char *old_file_search_regex = file_search_regex;
|
||||
ag_asprintf(&file_search_regex, "\\b%s\\b", file_search_regex);
|
||||
free(old_file_search_regex);
|
||||
}
|
||||
compile_study(&opts.file_search_regex, &opts.file_search_regex_extra, file_search_regex, pcre_opts, 0);
|
||||
compile_study(&opts.file_search_regex, &opts.file_search_regex_ctx, file_search_regex, pcre_opts, 0);
|
||||
free(file_search_regex);
|
||||
}
|
||||
|
||||
if (has_filetype) {
|
||||
num_exts = combine_file_extensions(ext_index, lang_num, &extensions);
|
||||
lang_regex = make_lang_regex(extensions, num_exts);
|
||||
compile_study(&opts.file_search_regex, &opts.file_search_regex_extra, lang_regex, 0, 0);
|
||||
compile_study(&opts.file_search_regex, &opts.file_search_regex_ctx, lang_regex, 0, 0);
|
||||
}
|
||||
|
||||
if (extensions) {
|
||||
|
|
@ -713,10 +689,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
const char *config_home = getenv("XDG_CONFIG_HOME");
|
||||
if (config_home) {
|
||||
ag_asprintf(&gitconfig_res, "%s/%s", config_home, "git/ignore");
|
||||
} else if (home_dir) {
|
||||
ag_asprintf(&gitconfig_res, "%s/%s", home_dir, ".config/git/ignore");
|
||||
} else {
|
||||
gitconfig_res = ag_strdup("");
|
||||
ag_asprintf(&gitconfig_res, "%s/%s", home_dir, ".config/git/ignore");
|
||||
}
|
||||
}
|
||||
log_debug("global core.excludesfile: %s", gitconfig_res);
|
||||
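Reviewer note: the hunk above differs in how it handles a missing home directory. One side formats `home_dir` unconditionally; the other checks it first and falls back to an empty string. Passing a null pointer to `%s` is undefined behavior, so the guarded form is the safer of the two. A sketch of the guarded lookup using `snprintf` into a caller-supplied buffer (the real code uses `ag_asprintf`; `demo_global_ignore_path` is a hypothetical name):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the global ignore path into `out`.
 * Returns 0 on success, -1 if the result did not fit or formatting failed. */
int demo_global_ignore_path(char *out, size_t out_len,
                            const char *config_home, const char *home_dir) {
    int n;

    if (config_home) {
        n = snprintf(out, out_len, "%s/%s", config_home, "git/ignore");
    } else if (home_dir) {
        n = snprintf(out, out_len, "%s/%s", home_dir, ".config/git/ignore");
    } else {
        /* Neither variable is set: leave the path empty. */
        n = snprintf(out, out_len, "%s", "");
    }
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

The caller would pass `getenv("XDG_CONFIG_HOME")` and `getenv("HOME")` as the last two arguments.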
|
|
@ -774,13 +748,8 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
}
|
||||
|
||||
if (accepts_query && argc > 0) {
|
||||
if (!needs_query && strlen(argv[0]) == 0) {
|
||||
// use default query
|
||||
opts.query = ag_strdup(".");
|
||||
} else {
|
||||
// use the provided query
|
||||
opts.query = ag_strdup(argv[0]);
|
||||
}
|
||||
// use the provided query
|
||||
opts.query = ag_strdup(argv[0]);
|
||||
argc--;
|
||||
argv++;
|
||||
} else if (!needs_query) {
|
||||
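Reviewer note: one side of this hunk substitutes the query `"."` when no query is required and the first positional argument is an empty string. A sketch of that selection as a standalone function; `demo_pick_query` is a hypothetical name, and unlike the hunk it returns a borrowed pointer instead of an `ag_strdup` copy.

```c
#include <assert.h>
#include <string.h>

/* Return "." when a query is optional and the argument is empty,
 * otherwise return the argument unchanged. */
const char *demo_pick_query(int needs_query, const char *arg) {
    assert(arg != NULL);
    if (!needs_query && strlen(arg) == 0) {
        return ".";
    }
    return arg;
}
```

An empty argument is kept as-is whenever a query is required, so only the optional-query case is rewritten.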
|
|
@ -801,7 +770,6 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
}
|
||||
|
||||
char *path = NULL;
|
||||
char *base_path = NULL;
|
||||
#ifdef PATH_MAX
|
||||
char *tmp = NULL;
|
||||
#endif
|
||||
|
|
@ -819,20 +787,10 @@ void parse_options(int argc, char **argv, char **base_paths[], char **paths[]) {
|
|||
(*paths)[i] = path;
|
||||
#ifdef PATH_MAX
|
||||
tmp = ag_malloc(PATH_MAX);
|
||||
base_path = realpath(path, tmp);
|
||||
(*base_paths)[i] = realpath(path, tmp);
|
||||
#else
|
||||
base_path = realpath(path, NULL);
|
||||
(*base_paths)[i] = realpath(path, NULL);
|
||||
#endif
|
||||
if (base_path) {
|
||||
base_path_len = strlen(base_path);
|
||||
/* add trailing slash */
|
||||
if (base_path_len > 1 && base_path[base_path_len - 1] != '/') {
|
||||
base_path = ag_realloc(base_path, base_path_len + 2);
|
||||
base_path[base_path_len] = '/';
|
||||
base_path[base_path_len + 1] = '\0';
|
||||
}
|
||||
}
|
||||
(*base_paths)[i] = base_path;
|
||||
}
|
||||
/* Make sure we search these paths instead of stdin. */
|
||||
opts.search_stream = 0;
|
||||
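Reviewer note: one side of the hunk above appends a trailing slash to each resolved base path, growing the buffer by two bytes (slash plus terminator). A sketch of that step in isolation, using plain `realloc` where the hunk uses `ag_realloc`; `demo_add_trailing_slash` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Takes ownership of the heap string `path`. Returns it with a trailing '/'
 * appended when it is longer than one character and does not already end in
 * one. Returns NULL for NULL input, or if realloc fails (path is then freed). */
char *demo_add_trailing_slash(char *path) {
    size_t len;
    char *grown;

    if (path == NULL) {
        return NULL;
    }
    len = strlen(path);
    if (len > 1 && path[len - 1] != '/') {
        grown = realloc(path, len + 2); /* room for '/' and '\0' */
        if (grown == NULL) {
            free(path);
            return NULL;
        }
        grown[len] = '/';
        grown[len + 1] = '\0';
        return grown;
    }
    return path;
}
```

The `len > 1` test leaves the root path `/` untouched, matching the condition in the hunk.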
|
|
|
|||
|
|
@ -1,10 +1,12 @@
|
|||
#ifndef OPTIONS_H
|
||||
#define OPTIONS_H
|
||||
|
||||
#include "config.h"
|
||||
|
||||
#include <getopt.h>
|
||||
#include <sys/stat.h>
|
||||
|
||||
#include <pcre.h>
|
||||
#include <pcre2.h>
|
||||
|
||||
#define DEFAULT_AFTER_LEN 2
|
||||
#define DEFAULT_BEFORE_LEN 2
|
||||
|
|
@ -28,15 +30,15 @@ enum path_print_behavior {
|
|||
|
||||
typedef struct {
|
||||
int ackmate;
|
||||
pcre *ackmate_dir_filter;
|
||||
pcre_extra *ackmate_dir_filter_extra;
|
||||
pcre2_code *ackmate_dir_filter;
|
||||
pcre2_compile_context *ackmate_dir_filter_ctx;
|
||||
size_t after;
|
||||
size_t before;
|
||||
enum case_behavior casing;
|
||||
const char *file_search_string;
|
||||
int match_files;
|
||||
pcre *file_search_regex;
|
||||
pcre_extra *file_search_regex_extra;
|
||||
pcre2_code *file_search_regex;
|
||||
pcre2_compile_context *file_search_regex_ctx;
|
||||
int color;
|
||||
char *color_line_number;
|
||||
char *color_match;
|
||||
|
|
@ -60,14 +62,12 @@ typedef struct {
|
|||
int print_break;
|
||||
int print_count;
|
||||
int print_filename_only;
|
||||
int print_nonmatching_files;
|
||||
int print_path;
|
||||
int print_all_paths;
|
||||
int print_line_numbers;
|
||||
int print_long_lines; /* TODO: support this in print.c */
|
||||
int passthrough;
|
||||
pcre *re;
|
||||
pcre_extra *re_extra;
|
||||
pcre2_code *re;
|
||||
pcre2_compile_context *re_ctx;
|
||||
int recurse_dirs;
|
||||
int search_all_files;
|
||||
int skip_vcs_ignores;
|
||||
|
|
@ -92,7 +92,7 @@ typedef struct {
|
|||
} cli_options;
|
||||
|
||||
/* global options. parse_options gives it sane values, everything else reads from it */
|
||||
extern cli_options opts;
|
||||
cli_options opts;
|
||||
|
||||
typedef struct option option_t;
|
||||
|
||||
|
|
|
|||
18
src/print.c
|
|
@ -26,7 +26,6 @@ __thread struct print_context {
|
|||
size_t prev_line;
|
||||
size_t last_prev_line;
|
||||
size_t prev_line_offset;
|
||||
size_t line_preceding_current_match_offset;
|
||||
size_t lines_since_last_match;
|
||||
size_t last_printed_match;
|
||||
int in_a_match;
|
||||
|
|
@ -42,7 +41,6 @@ void print_init_context(void) {
|
|||
print_context.prev_line = 0;
|
||||
print_context.last_prev_line = 0;
|
||||
print_context.prev_line_offset = 0;
|
||||
print_context.line_preceding_current_match_offset = 0;
|
||||
print_context.lines_since_last_match = INT_MAX;
|
||||
print_context.last_printed_match = 0;
|
||||
print_context.in_a_match = FALSE;
|
||||
|
|
@ -150,7 +148,6 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
|
|||
ssize_t lines_to_print = 0;
|
||||
char sep = '-';
|
||||
size_t i, j;
|
||||
int blanks_between_matches = opts.context || opts.after || opts.before;
|
||||
|
||||
if (opts.ackmate || opts.vimgrep) {
|
||||
sep = ':';
|
||||
|
|
@ -176,7 +173,7 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
|
|||
if (cur_match < matches_len && i == matches[cur_match].start) {
|
||||
print_context.in_a_match = TRUE;
|
||||
/* We found the start of a match */
|
||||
if (cur_match > 0 && blanks_between_matches && print_context.lines_since_last_match > (opts.before + opts.after + 1)) {
|
||||
if (cur_match > 0 && opts.context && print_context.lines_since_last_match > (opts.before + opts.after + 1)) {
|
||||
fprintf(out_fd, "--\n");
|
||||
}
|
||||
|
||||
|
|
@ -225,10 +222,14 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
|
|||
/* print headers for ackmate to parse */
|
||||
print_line_number(print_context.line, ';');
|
||||
for (; print_context.last_printed_match < cur_match; print_context.last_printed_match++) {
|
||||
size_t start = matches[print_context.last_printed_match].start - print_context.line_preceding_current_match_offset;
|
||||
fprintf(out_fd, "%lu %lu",
|
||||
/* Don't print negative offsets. This isn't quite right, but not many people use --ackmate */
|
||||
long start = (long)(matches[print_context.last_printed_match].start - print_context.prev_line_offset);
|
||||
if (start < 0) {
|
||||
start = 0;
|
||||
}
|
||||
fprintf(out_fd, "%li %li",
|
||||
start,
|
||||
matches[print_context.last_printed_match].end - matches[print_context.last_printed_match].start);
|
||||
(long)(matches[print_context.last_printed_match].end - matches[print_context.last_printed_match].start));
|
||||
print_context.last_printed_match == cur_match - 1 ? fputc(':', out_fd) : fputc(',', out_fd);
|
||||
}
|
||||
print_line(buf, i, print_context.prev_line_offset);
|
||||
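Reviewer note: both sides of the ackmate hunk compute a column as `match start - line offset` on `size_t` values. When a match starts before the offset being subtracted, the unsigned subtraction wraps around; one side works around this by casting to `long` and clamping negatives to zero. A sketch of a clamp that compares before subtracting, so no cast is needed (`demo_match_column` is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>

/* Column of a match relative to the start of its line, clamped to 0 when the
 * match begins before that line (for example, a multi-line match). */
size_t demo_match_column(size_t match_start, size_t line_start) {
    return match_start >= line_start ? match_start - line_start : 0;
}
```

For instance, a match at byte 10 on a line starting at byte 4 yields column 6, while a match at byte 3 with a line start of 8 yields 0.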
|
|
@ -315,9 +316,6 @@ void print_file_matches(const char *path, const char *buf, const size_t buf_len,
|
|||
print_trailing_context(path, &buf[print_context.prev_line_offset], i - print_context.prev_line_offset);
|
||||
|
||||
print_context.prev_line_offset = i + 1; /* skip the newline */
|
||||
if (!print_context.in_a_match) {
|
||||
print_context.line_preceding_current_match_offset = i + 1;
|
||||
}
|
||||
|
||||
/* File doesn't end with a newline. Print one so the output is pretty. */
|
||||
if (i == buf_len && buf[i - 1] != '\n') {
|
||||
|
|
|
|||
|
|
@ -7,7 +7,6 @@ typedef struct {
|
|||
const ignores *ig;
|
||||
const char *base_path;
|
||||
size_t base_path_len;
|
||||
const char *path_start;
|
||||
} scandir_baton_t;
|
||||
|
||||
typedef int (*filter_fp)(const char *path, const struct dirent *, void *);
|
||||
|
|
|
|||
172
src/search.c
|
|
@ -2,32 +2,18 @@
|
|||
#include "print.h"
|
||||
#include "scandir.h"
|
||||
|
||||
size_t alpha_skip_lookup[256];
|
||||
size_t *find_skip_lookup;
|
||||
uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
|
||||
|
||||
work_queue_t *work_queue = NULL;
|
||||
work_queue_t *work_queue_tail = NULL;
|
||||
int done_adding_files = 0;
|
||||
pthread_cond_t files_ready = PTHREAD_COND_INITIALIZER;
|
||||
pthread_mutex_t stats_mtx = PTHREAD_MUTEX_INITIALIZER;
|
||||
pthread_mutex_t work_queue_mtx = PTHREAD_MUTEX_INITIALIZER;
|
||||
|
||||
symdir_t *symhash = NULL;
|
||||
|
||||
/* Returns: -1 if skipped, otherwise # of matches */
|
||||
ssize_t search_buf(const char *buf, const size_t buf_len,
|
||||
const char *dir_full_path) {
|
||||
void search_buf(const char *buf, const size_t buf_len,
|
||||
const char *dir_full_path) {
|
||||
int binary = -1; /* 1 = yes, 0 = no, -1 = don't know */
|
||||
size_t buf_offset = 0;
|
||||
|
||||
if (opts.search_stream) {
|
||||
binary = 0;
|
||||
} else if (!opts.search_binary_files && opts.mmap) { /* if not using mmap, binary files have already been skipped */
|
||||
} else if (!opts.search_binary_files) {
|
||||
binary = is_binary((const void *)buf, buf_len);
|
||||
if (binary) {
|
||||
log_debug("File %s is binary. Skipping...", dir_full_path);
|
||||
return -1;
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -59,18 +45,18 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
|
|||
matches_len = 1;
|
||||
} else if (opts.literal) {
|
||||
const char *match_ptr = buf;
|
||||
strncmp_fp ag_strnstr_fp = get_strstr(opts.casing);
|
||||
|
||||
while (buf_offset < buf_len) {
|
||||
/* hash_strnstr only for little-endian platforms that allow unaligned access */
|
||||
#if defined(__i386__) || defined(__x86_64__)
|
||||
/* Decide whether to fall back on boyer-moore */
|
||||
if ((size_t)opts.query_len < 2 * sizeof(uint16_t) - 1 || opts.query_len >= UCHAR_MAX) {
|
||||
match_ptr = boyer_moore_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup, opts.casing == CASE_INSENSITIVE);
|
||||
} else {
|
||||
if ((size_t)opts.query_len < 2 * sizeof(uint16_t) - 1 || opts.query_len >= UCHAR_MAX)
|
||||
match_ptr = ag_strnstr_fp(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup);
|
||||
else
|
||||
match_ptr = hash_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, h_table, opts.casing == CASE_SENSITIVE);
|
||||
}
|
||||
#else
|
||||
match_ptr = boyer_moore_strnstr(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup, opts.casing == CASE_INSENSITIVE);
|
||||
match_ptr = ag_strnstr_fp(match_ptr, opts.query, buf_len - buf_offset, opts.query_len, alpha_skip_lookup, find_skip_lookup);
|
||||
#endif
|
||||
|
||||
if (match_ptr == NULL) {
|
||||
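Reviewer note: the literal-search hunk chooses between the Boyer-Moore routine and `hash_strnstr` based on query length. The condition `query_len < 2 * sizeof(uint16_t) - 1 || query_len >= UCHAR_MAX` works out to "shorter than 3 bytes, or 255 bytes and longer" on platforms with 8-bit bytes. A sketch of the predicate alone (`demo_use_boyer_moore` is a hypothetical name):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 when the query length falls outside the range the hash-based
 * search is used for in the hunk, 0 otherwise. */
int demo_use_boyer_moore(size_t query_len) {
    return query_len < 2 * sizeof(uint16_t) - 1 || query_len >= UCHAR_MAX;
}
```

On a platform with 8-bit bytes, lengths 3 through 254 take the hash path and everything else falls back to Boyer-Moore.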
|
|
@ -114,8 +100,11 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
|
|||
} else {
|
||||
int offset_vector[3];
|
||||
if (opts.multiline) {
|
||||
/* we just care about the match, not where the matches are */
|
||||
return pcre2_match(opts.ackmate_dir_filter, dir_name, strlen(dir_name), 0, 0, NULL, NULL);
|
||||
|
||||
while (buf_offset < buf_len &&
|
||||
(pcre_exec(opts.re, opts.re_extra, buf, buf_len, buf_offset, 0, offset_vector, 3)) >= 0) {
|
||||
(pcre2_match(opts.re, buf, buf_len, buf_offset, 0, match_data)) >= 0) {
|
||||
log_debug("Regex match found. File %s, offset %i bytes.", dir_full_path, offset_vector[0]);
|
||||
buf_offset = offset_vector[1];
|
||||
if (offset_vector[0] == offset_vector[1]) {
|
||||
|
|
@ -143,7 +132,7 @@ ssize_t search_buf(const char *buf, const size_t buf_len,
|
|||
}
|
||||
size_t line_offset = 0;
|
||||
while (line_offset < line_len) {
|
||||
int rv = pcre_exec(opts.re, opts.re_extra, line, line_len, line_offset, 0, offset_vector, 3);
|
||||
int rv = pcre2_match(opts.re, opts.re_ctx, line, line_len, line_offset, 0, offset_vector, 3);
|
||||
if (rv < 0) {
|
||||
break;
|
||||
}
|
||||
|
|
@ -188,16 +177,25 @@ multiline_done:
|
|||
pthread_mutex_unlock(&stats_mtx);
|
||||
}
|
||||
|
||||
if (!opts.print_nonmatching_files && (matches_len > 0 || opts.print_all_paths)) {
|
||||
if (matches_len > 0) {
|
||||
if (binary == -1 && !opts.print_filename_only) {
|
||||
binary = is_binary((const void *)buf, buf_len);
|
||||
}
|
||||
pthread_mutex_lock(&print_mtx);
|
||||
if (opts.print_filename_only) {
|
||||
if (opts.print_count) {
|
||||
print_path_count(dir_full_path, opts.path_sep, (size_t)matches_len);
|
||||
} else {
|
||||
print_path(dir_full_path, opts.path_sep);
|
||||
/* If the --files-without-matches or -L option is passed we should
|
||||
* not print a matching line. This option currently sets
|
||||
* opts.print_filename_only and opts.invert_match. Unfortunately
|
||||
* setting the latter has the side effect of making matches.len = 1
|
||||
* on a file-without-matches which is not desired behaviour. See
|
||||
* GitHub issue 206 for the consequences if this behaviour is not
|
||||
* checked. */
|
||||
if (!opts.invert_match || matches_len < 2) {
|
||||
if (opts.print_count) {
|
||||
print_path_count(dir_full_path, opts.path_sep, (size_t)matches_len);
|
||||
} else {
|
||||
print_path(dir_full_path, opts.path_sep);
|
||||
}
|
||||
}
|
||||
} else if (binary) {
|
||||
print_binary_file_matches(dir_full_path);
|
||||
|
|
@ -219,16 +217,11 @@ multiline_done:
|
|||
if (matches_size > 0) {
|
||||
free(matches);
|
||||
}
|
||||
|
||||
/* FIXME: handle case where matches_len > SSIZE_MAX */
|
||||
return (ssize_t)matches_len;
|
||||
}
|
||||
|
||||
/* Return value: -1 if skipped, otherwise # of matches */
|
||||
/* TODO: this will only match single lines. multi-line regexes silently don't match */
|
||||
ssize_t search_stream(FILE *stream, const char *path) {
|
||||
void search_stream(FILE *stream, const char *path) {
|
||||
char *line = NULL;
|
||||
ssize_t matches_count = 0;
|
||||
ssize_t line_len = 0;
|
||||
size_t line_cap = 0;
|
||||
size_t i;
|
||||
|
|
@ -236,17 +229,8 @@ ssize_t search_stream(FILE *stream, const char *path) {
|
|||
print_init_context();
|
||||
|
||||
for (i = 1; (line_len = getline(&line, &line_cap, stream)) > 0; i++) {
|
||||
ssize_t result;
|
||||
opts.stream_line_num = i;
|
||||
result = search_buf(line, line_len, path);
|
||||
if (result > 0) {
|
||||
if (matches_count == -1) {
|
||||
matches_count = 0;
|
||||
}
|
||||
matches_count += result;
|
||||
} else if (matches_count <= 0 && result == -1) {
|
||||
matches_count = -1;
|
||||
}
|
||||
search_buf(line, line_len, path);
|
||||
if (line[line_len - 1] == '\n') {
|
||||
line_len--;
|
||||
}
|
||||
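Reviewer note: one side of the `search_stream` hunk accumulates per-line results into `matches_count`, where `-1` means "skipped" and positive values are match counts. The rule is that any real match overrides an earlier skip, and a skip only sticks while no matches have been seen. A sketch of that fold step as a pure function (`demo_fold_result` is a hypothetical name):

```c
#include <assert.h>
#include <sys/types.h>

/* Combine the running total with one search_buf-style result.
 * -1 means "skipped"; positive values are match counts. */
ssize_t demo_fold_result(ssize_t total, ssize_t result) {
    if (result > 0) {
        if (total == -1) {
            total = 0; /* a real match replaces an earlier skip marker */
        }
        total += result;
    } else if (total <= 0 && result == -1) {
        total = -1; /* remember the skip only while nothing has matched */
    }
    return total;
}
```

Starting from 0, the sequence of results `-1, 2, -1` folds to `-1`, then `2`, then stays at `2`.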
|
|
@ -255,35 +239,16 @@ ssize_t search_stream(FILE *stream, const char *path) {
|
|||
|
||||
free(line);
|
||||
print_cleanup_context();
|
||||
return matches_count;
|
||||
}
|
||||
|
||||
void search_file(const char *file_full_path) {
|
||||
int fd = -1;
|
||||
int fd;
|
||||
off_t f_len = 0;
|
||||
char *buf = NULL;
|
||||
struct stat statbuf;
|
||||
int rv = 0;
|
||||
int matches_count = -1;
|
||||
FILE *fp = NULL;
|
||||
|
||||
rv = stat(file_full_path, &statbuf);
|
||||
if (rv != 0) {
|
||||
log_err("Skipping %s: Error fstat()ing file.", file_full_path);
|
||||
goto cleanup;
|
||||
}
|
||||
|
||||
if (opts.stdout_inode != 0 && opts.stdout_inode == statbuf.st_ino) {
|
||||
log_debug("Skipping %s: stdout is redirected to it", file_full_path);
|
||||
goto cleanup;
|
||||
}
|
||||
|
||||
// handling only regular files and FIFOs
|
||||
if (!S_ISREG(statbuf.st_mode) && !S_ISFIFO(statbuf.st_mode)) {
|
||||
log_err("Skipping %s: Mode %u is not a file.", file_full_path, statbuf.st_mode);
|
||||
goto cleanup;
|
||||
}
|
||||
|
||||
fd = open(file_full_path, O_RDONLY);
|
||||
if (fd < 0) {
|
||||
/* XXXX: strerror is not thread-safe */
|
||||
|
|
@ -291,7 +256,6 @@ void search_file(const char *file_full_path) {
|
|||
goto cleanup;
|
||||
}
|
||||
|
||||
// repeating stat check with file handle to prevent TOCTOU issue
|
||||
rv = fstat(fd, &statbuf);
|
||||
if (rv != 0) {
|
||||
log_err("Skipping %s: Error fstat()ing file.", file_full_path);
|
||||
|
|
@ -303,8 +267,7 @@ void search_file(const char *file_full_path) {
|
|||
goto cleanup;
|
||||
}
|
||||
|
||||
// handling only regular files and FIFOs
|
||||
if (!S_ISREG(statbuf.st_mode) && !S_ISFIFO(statbuf.st_mode)) {
|
||||
if ((statbuf.st_mode & S_IFMT) == 0) {
|
||||
log_err("Skipping %s: Mode %u is not a file.", file_full_path, statbuf.st_mode);
|
||||
goto cleanup;
|
||||
}
|
||||
|
|
@ -314,7 +277,7 @@ void search_file(const char *file_full_path) {
|
|||
if (statbuf.st_mode & S_IFIFO) {
|
||||
log_debug("%s is a named pipe. stream searching", file_full_path);
|
||||
fp = fdopen(fd, "r");
|
||||
matches_count = search_stream(fp, file_full_path);
|
||||
search_stream(fp, file_full_path);
|
||||
fclose(fp);
|
||||
goto cleanup;
|
||||
}
|
||||
|
|
@ -323,18 +286,13 @@ void search_file(const char *file_full_path) {
|
|||
|
||||
if (f_len == 0) {
|
||||
if (opts.query[0] == '.' && opts.query_len == 1 && !opts.literal && opts.search_all_files) {
|
||||
matches_count = search_buf(buf, f_len, file_full_path);
|
||||
search_buf(buf, f_len, file_full_path);
|
||||
} else {
|
||||
log_debug("Skipping %s: file is empty.", file_full_path);
|
||||
}
|
||||
goto cleanup;
|
||||
}
|
||||
|
||||
if (!opts.literal && f_len > INT_MAX) {
|
||||
log_err("Skipping %s: pcre_exec() can't handle files larger than %i bytes.", file_full_path, INT_MAX);
|
||||
goto cleanup;
|
||||
}
|
||||
|
||||
#ifdef _WIN32
|
||||
{
|
||||
HANDLE hmmap = CreateFileMapping(
|
||||
|
|
@ -368,23 +326,9 @@ void search_file(const char *file_full_path) {
|
|||
#endif
|
||||
} else {
|
||||
buf = ag_malloc(f_len);
|
||||
|
||||
ssize_t bytes_read = 0;
|
||||
|
||||
if (!opts.search_binary_files) {
|
||||
bytes_read += read(fd, buf, ag_min(f_len, 512));
|
||||
// Optimization: If skipping binary files, don't read the whole buffer before checking if binary or not.
|
||||
if (is_binary(buf, f_len)) {
|
||||
log_debug("File %s is binary. Skipping...", file_full_path);
|
||||
goto cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
while (bytes_read < f_len) {
|
||||
bytes_read += read(fd, buf + bytes_read, f_len);
|
||||
}
|
||||
if (bytes_read != f_len) {
|
||||
die("File %s read(): expected to read %u bytes but read %u", file_full_path, f_len, bytes_read);
|
||||
size_t bytes_read = read(fd, buf, f_len);
|
||||
if ((off_t)bytes_read != f_len) {
|
||||
die("expected to read %u bytes but read %u", f_len, bytes_read);
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
|
@ -392,45 +336,29 @@ void search_file(const char *file_full_path) {
|
|||
if (opts.search_zip_files) {
|
||||
ag_compression_type zip_type = is_zipped(buf, f_len);
|
||||
if (zip_type != AG_NO_COMPRESSION) {
|
||||
#if HAVE_FOPENCOOKIE
|
||||
log_debug("%s is a compressed file. stream searching", file_full_path);
|
||||
fp = decompress_open(fd, "r", zip_type);
|
||||
matches_count = search_stream(fp, file_full_path);
|
||||
fclose(fp);
|
||||
#else
|
||||
int _buf_len = (int)f_len;
|
||||
char *_buf = decompress(zip_type, buf, f_len, file_full_path, &_buf_len);
|
||||
if (_buf == NULL || _buf_len == 0) {
|
||||
log_err("Cannot decompress zipped file %s", file_full_path);
|
||||
goto cleanup;
|
||||
}
|
||||
matches_count = search_buf(_buf, _buf_len, file_full_path);
|
||||
search_buf(_buf, _buf_len, file_full_path);
|
||||
free(_buf);
|
||||
#endif
|
||||
goto cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
matches_count = search_buf(buf, f_len, file_full_path);
|
||||
search_buf(buf, f_len, file_full_path);
|
||||
|
||||
cleanup:
|
||||
|
||||
if (opts.print_nonmatching_files && matches_count == 0) {
|
||||
pthread_mutex_lock(&print_mtx);
|
||||
print_path(file_full_path, opts.path_sep);
|
||||
pthread_mutex_unlock(&print_mtx);
|
||||
opts.match_found = 1;
|
||||
}
|
||||
|
||||
print_cleanup_context();
|
||||
if (buf != NULL) {
|
||||
#ifdef _WIN32
|
||||
UnmapViewOfFile(buf);
|
||||
#else
|
||||
if (opts.mmap) {
|
||||
if (buf != MAP_FAILED) {
|
||||
munmap(buf, f_len);
|
||||
}
|
||||
munmap(buf, f_len);
|
||||
} else {
|
||||
free(buf);
|
||||
}
|
||||
|
|
@ -446,6 +374,7 @@ void *search_file_worker(void *i) {
|
|||
int worker_id = *(int *)i;
|
||||
|
||||
log_debug("Worker %i started", worker_id);
|
||||
match_data = pcre2_match_data_create_from_pattern(re, NULL);
|
||||
while (TRUE) {
|
||||
pthread_mutex_lock(&work_queue_mtx);
|
||||
while (work_queue == NULL) {
|
||||
|
|
@ -533,8 +462,6 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
|
|||
struct dirent *dir = NULL;
|
||||
scandir_baton_t scandir_baton;
|
||||
int results = 0;
|
||||
size_t base_path_len = 0;
|
||||
const char *path_start = path;
|
||||
|
||||
char *dir_full_path = NULL;
|
||||
const char *ignore_file = NULL;
|
||||
|
|
@ -550,7 +477,7 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
|
|||
}
|
||||
|
||||
/* find .*ignore files to load ignore patterns from */
|
||||
for (i = 0; opts.skip_vcs_ignores ? (i == 0) : (ignore_pattern_files[i] != NULL); i++) {
|
||||
for (i = 0; opts.skip_vcs_ignores ? (i <= 1) : (ignore_pattern_files[i] != NULL); i++) {
|
||||
ignore_file = ignore_pattern_files[i];
|
||||
ag_asprintf(&dir_full_path, "%s/%s", path, ignore_file);
|
||||
load_ignore_patterns(ig, dir_full_path);
|
||||
|
|
@ -558,20 +485,9 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
|
|||
dir_full_path = NULL;
|
||||
}
|
||||
|
||||
/* path_start is the part of path that isn't in base_path
|
||||
* base_path will have a trailing '/' because we put it there in parse_options
|
||||
*/
|
||||
base_path_len = base_path ? strlen(base_path) : 0;
|
||||
for (i = 0; ((size_t)i < base_path_len) && (path[i]) && (base_path[i] == path[i]); i++) {
|
||||
path_start = path + i + 1;
|
||||
}
|
||||
log_debug("search_dir: path is '%s', base_path is '%s', path_start is '%s'", path, base_path, path_start);
|
||||
|
||||
scandir_baton.ig = ig;
|
||||
scandir_baton.base_path = base_path;
|
||||
scandir_baton.base_path_len = base_path_len;
|
||||
scandir_baton.path_start = path_start;
|
||||
|
||||
scandir_baton.base_path_len = base_path ? strlen(base_path) : 0;
|
||||
results = ag_scandir(path, &dir_list, &filename_filter, &scandir_baton);
|
||||
if (results == 0) {
|
||||
log_debug("No results found in directory %s", path);
|
||||
|
|
@ -626,7 +542,7 @@ void search_dir(ignores *ig, const char *base_path, const char *path, const int
|
|||
|
||||
if (!is_directory(path, dir)) {
|
||||
if (opts.file_search_regex) {
|
||||
rc = pcre_exec(opts.file_search_regex, NULL, dir_full_path, strlen(dir_full_path),
|
||||
rc = pcre2_match(opts.file_search_regex, NULL, dir_full_path, strlen(dir_full_path),
|
||||
0, 0, offset_vector, 3);
|
||||
if (rc < 0) { /* no match */
|
||||
log_debug("Skipping %s due to file_search_regex.", dir_full_path);
|
||||
|
|
|
|||
32
src/search.h
|
|
@ -1,11 +1,13 @@
|
|||
#ifndef SEARCH_H
|
||||
#define SEARCH_H
|
||||
|
||||
#include "config.h"
|
||||
|
||||
#include <dirent.h>
|
||||
#include <errno.h>
|
||||
#include <fcntl.h>
|
||||
#include <limits.h>
|
||||
#include <pcre.h>
|
||||
#include <pcre2.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
|
@ -17,8 +19,6 @@
|
|||
#include <sys/stat.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include "config.h"
|
||||
|
||||
#ifdef HAVE_PTHREAD_H
|
||||
#include <pthread.h>
|
||||
#endif
|
||||
|
|
@ -31,9 +31,9 @@
|
|||
#include "uthash.h"
|
||||
#include "util.h"
|
||||
|
||||
extern size_t alpha_skip_lookup[256];
|
||||
extern size_t *find_skip_lookup;
|
||||
extern uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
|
||||
size_t alpha_skip_lookup[256];
|
||||
size_t *find_skip_lookup;
|
||||
uint8_t h_table[H_SIZE] __attribute__((aligned(64)));
|
||||
|
||||
struct work_queue_t {
|
||||
char *path;
|
||||
|
|
@ -41,12 +41,12 @@ struct work_queue_t {
|
|||
};
|
||||
typedef struct work_queue_t work_queue_t;
|
||||
|
||||
extern work_queue_t *work_queue;
|
||||
extern work_queue_t *work_queue_tail;
|
||||
extern int done_adding_files;
|
||||
extern pthread_cond_t files_ready;
|
||||
extern pthread_mutex_t stats_mtx;
|
||||
extern pthread_mutex_t work_queue_mtx;
|
||||
work_queue_t *work_queue;
|
||||
work_queue_t *work_queue_tail;
|
||||
int done_adding_files;
|
||||
pthread_cond_t files_ready;
|
||||
pthread_mutex_t stats_mtx;
|
||||
pthread_mutex_t work_queue_mtx;
|
||||
|
||||
|
||||
/* For symlink loop detection */
|
||||
|
|
@ -64,11 +64,11 @@ typedef struct {
|
|||
UT_hash_handle hh;
|
||||
} symdir_t;
|
||||
|
||||
extern symdir_t *symhash;
|
||||
symdir_t *symhash;
|
||||
|
||||
ssize_t search_buf(const char *buf, const size_t buf_len,
|
||||
const char *dir_full_path);
|
||||
ssize_t search_stream(FILE *stream, const char *path);
|
||||
void search_buf(const char *buf, const size_t buf_len,
|
||||
const char *dir_full_path);
|
||||
void search_stream(FILE *stream, const char *path);
|
||||
void search_file(const char *file_full_path);
|
||||
|
||||
void *search_file_worker(void *i);
|
||||
|
|
|
|||
10
src/uthash.h
|
|
@ -457,34 +457,24 @@ typedef unsigned char uint8_t;
|
|||
switch (_hj_k) { \
|
||||
case 11: \
|
||||
hashv += ((unsigned)_hj_key[10] << 24); \
|
||||
/* fall through */ \
|
||||
case 10: \
|
||||
hashv += ((unsigned)_hj_key[9] << 16); \
|
||||
/* fall through */ \
|
||||
case 9: \
|
||||
hashv += ((unsigned)_hj_key[8] << 8); \
|
||||
/* fall through */ \
|
||||
case 8: \
|
||||
_hj_j += ((unsigned)_hj_key[7] << 24); \
|
||||
/* fall through */ \
|
||||
case 7: \
|
||||
_hj_j += ((unsigned)_hj_key[6] << 16); \
|
||||
/* fall through */ \
|
||||
case 6: \
|
||||
_hj_j += ((unsigned)_hj_key[5] << 8); \
|
||||
/* fall through */ \
|
||||
case 5: \
|
||||
_hj_j += _hj_key[4]; \
|
||||
/* fall through */ \
|
||||
case 4: \
|
||||
_hj_i += ((unsigned)_hj_key[3] << 24); \
|
||||
/* fall through */ \
|
||||
case 3: \
|
||||
_hj_i += ((unsigned)_hj_key[2] << 16); \
|
||||
/* fall through */ \
|
||||
case 2: \
|
||||
_hj_i += ((unsigned)_hj_key[1] << 8); \
|
||||
/* fall through */ \
|
||||
case 1: \
|
||||
_hj_i += _hj_key[0]; \
|
||||
} \
|
||||
|
|
|
|||
75
src/util.c
|
|
@ -1,3 +1,5 @@
|
|||
#include "config.h"
|
||||
|
||||
#include <ctype.h>
|
||||
#include <stdarg.h>
|
||||
#include <stdio.h>
|
||||
|
|
@ -5,7 +7,6 @@
|
|||
#include <string.h>
|
||||
#include <sys/stat.h>
|
||||
|
||||
#include "config.h"
|
||||
#include "util.h"
|
||||
|
||||
#ifdef _WIN32
|
||||
|
|
@ -21,8 +22,6 @@
|
|||
} \
|
||||
return ptr;
|
||||
|
||||
FILE *out_fd = NULL;
|
||||
ag_stats stats;
|
||||
void *ag_malloc(size_t size) {
|
||||
void *ptr = malloc(size);
|
||||
CHECK_AND_RETURN(ptr)
|
||||
|
|
@ -150,13 +149,6 @@ size_t ag_max(size_t a, size_t b) {
|
|||
return a;
|
||||
}
|
||||
|
||||
size_t ag_min(size_t a, size_t b) {
|
||||
if (b < a) {
|
||||
return b;
|
||||
}
|
||||
return a;
|
||||
}
|
||||
|
||||
void generate_hash(const char *find, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
|
||||
int i;
|
||||
for (i = f_len - sizeof(uint16_t); i >= 0; i--) {
|
||||
|
|
@ -185,12 +177,12 @@ void generate_hash(const char *find, const size_t f_len, uint8_t *h_table, const
|
|||
|
||||
/* Boyer-Moore strstr */
|
||||
const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len,
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup, const int case_insensitive) {
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup) {
|
||||
ssize_t i;
|
||||
size_t pos = f_len - 1;
|
||||
|
||||
while (pos < s_len) {
|
||||
for (i = f_len - 1; i >= 0 && (case_insensitive ? tolower(s[pos]) : s[pos]) == find[i]; pos--, i--) {
|
||||
for (i = f_len - 1; i >= 0 && s[pos] == find[i]; pos--, i--) {
|
||||
}
|
||||
if (i < 0) {
|
||||
return s + pos + 1;
|
||||
|
|
@ -201,9 +193,25 @@ const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_
|
|||
return NULL;
|
||||
}
|
||||
|
||||
// Clang's -fsanitize=alignment (included in -fsanitize=undefined) will flag
|
||||
// the intentional unaligned access here, so suppress it for this function
|
||||
NO_SANITIZE_ALIGNMENT const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
|
||||
/* Copy-pasted from above. Yes I know this is bad. One day I might even fix it. */
|
||||
const char *boyer_moore_strncasestr(const char *s, const char *find, const size_t s_len, const size_t f_len,
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup) {
|
||||
ssize_t i;
|
||||
size_t pos = f_len - 1;
|
||||
|
||||
while (pos < s_len) {
|
||||
for (i = f_len - 1; i >= 0 && tolower(s[pos]) == find[i]; pos--, i--) {
|
||||
}
|
||||
if (i < 0) {
|
||||
return s + pos + 1;
|
||||
}
|
||||
pos += ag_max(alpha_skip_lookup[(unsigned char)s[pos]], find_skip_lookup[i]);
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive) {
|
||||
if (s_len < f_len)
|
||||
return NULL;
|
||||
|
||||
|
|
@ -239,6 +247,17 @@ NO_SANITIZE_ALIGNMENT const char *hash_strnstr(const char *s, const char *find,
|
|||
return NULL;
|
||||
}
|
||||
|
||||
|
||||
strncmp_fp get_strstr(enum case_behavior casing) {
|
||||
strncmp_fp ag_strncmp_fp = &boyer_moore_strnstr;
|
||||
|
||||
if (casing == CASE_INSENSITIVE) {
|
||||
ag_strncmp_fp = &boyer_moore_strncasestr;
|
||||
}
|
||||
|
||||
return ag_strncmp_fp;
|
||||
}
|
||||
|
||||
size_t invert_matches(const char *buf, const size_t buf_len, match_t matches[], size_t matches_len) {
|
||||
size_t i;
|
||||
size_t match_read_index = 0;
|
||||
|
|
@ -313,19 +332,23 @@ void realloc_matches(match_t **matches, size_t *matches_size, size_t matches_len
|
|||
*matches = ag_realloc(*matches, *matches_size * sizeof(match_t));
|
||||
}
|
||||
|
||||
void compile_study(pcre **re, pcre_extra **re_extra, char *q, const int pcre_opts, const int study_opts) {
|
||||
void compile_study(pcre2_code **re, pcre2_compile_context **re_ctx, char *q, const uint32_t pcre_opts, const int study_opts) {
|
||||
const char *pcre_err = NULL;
|
||||
int pcre_err_offset = 0;
|
||||
|
||||
*re = pcre_compile(q, pcre_opts, &pcre_err, &pcre_err_offset, NULL);
|
||||
*re = pcre2_compile(q, pcre_opts, &pcre_err, &pcre_err_offset, NULL, NULL);
|
||||
if (*re == NULL) {
|
||||
die("Bad regex! pcre_compile() failed at position %i: %s\nIf you meant to search for a literal string, run ag with -Q",
|
||||
// TODO: use pcre2_get_error_message()
|
||||
die("Bad regex! pcre2_compile() failed at position %i: %s\nIf you meant to search for a literal string, run ag with -Q",
|
||||
pcre_err_offset,
|
||||
pcre_err);
|
||||
}
|
||||
*re_extra = pcre_study(*re, study_opts, &pcre_err);
|
||||
if (*re_extra == NULL) {
|
||||
log_debug("pcre_study returned nothing useful. Error: %s", pcre_err);
|
||||
pcre2_jit_compile(*re, pcre_opts);
|
||||
*re_ctx = NULL;
|
||||
*re_ctx = pcre2_match_data_create_from_pattern(*re, NULL);
|
||||
// *re_ctx = pcre2_init_context(NULL);
|
||||
if (*re_ctx == NULL) {
|
||||
log_debug("pcre2_init_context returned nothing useful. Error: %s", pcre_err);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -518,7 +541,7 @@ int is_symlink(const char *path, const struct dirent *d) {
|
|||
|
||||
int is_named_pipe(const char *path, const struct dirent *d) {
|
||||
#ifdef HAVE_DIRENT_DTYPE
|
||||
if (d->d_type != DT_UNKNOWN && d->d_type != DT_LNK) {
|
||||
if (d->d_type != DT_UNKNOWN) {
|
||||
return d->d_type == DT_FIFO || d->d_type == DT_SOCK;
|
||||
}
|
||||
#endif
|
||||
|
|
@ -530,11 +553,7 @@ int is_named_pipe(const char *path, const struct dirent *d) {
|
|||
return FALSE;
|
||||
}
|
||||
free(full_path);
|
||||
return S_ISFIFO(s.st_mode)
|
||||
#ifdef S_ISSOCK
|
||||
|| S_ISSOCK(s.st_mode)
|
||||
#endif
|
||||
;
|
||||
return S_ISFIFO(s.st_mode) || S_ISSOCK(s.st_mode);
|
||||
}
|
||||
|
||||
void ag_asprintf(char **ret, const char *fmt, ...) {
|
||||
|
|
@ -626,7 +645,7 @@ ssize_t getline(char **lineptr, size_t *n, FILE *stream) {
|
|||
ssize_t buf_getline(const char **line, const char *buf, const size_t buf_len, const size_t buf_offset) {
|
||||
const char *cur = buf + buf_offset;
|
||||
ssize_t i;
|
||||
for (i = 0; (buf_offset + i < buf_len) && cur[i] != '\n'; i++) {
|
||||
for (i = 0; cur[i] != '\n' && (buf_offset + i < buf_len); i++) {
|
||||
}
|
||||
*line = cur;
|
||||
return i;
|
||||
|
|
|
|||
32
src/util.h
|
|
@ -2,17 +2,18 @@
|
|||
#define UTIL_H
|
||||
|
||||
#include <dirent.h>
|
||||
#include <pcre.h>
|
||||
#include <stdint.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <sys/time.h>
|
||||
|
||||
#include "config.h"
|
||||
#include <pcre2.h>
|
||||
|
||||
#include "log.h"
|
||||
#include "options.h"
|
||||
|
||||
extern FILE *out_fd;
|
||||
FILE *out_fd;
|
||||
|
||||
#ifndef TRUE
|
||||
#define TRUE 1
|
||||
|
|
@ -24,12 +25,6 @@ extern FILE *out_fd;
|
|||
|
||||
#define H_SIZE (64 * 1024)
|
||||
|
||||
#ifdef __clang__
|
||||
#define NO_SANITIZE_ALIGNMENT __attribute__((no_sanitize("alignment")))
|
||||
#else
|
||||
#define NO_SANITIZE_ALIGNMENT
|
||||
#endif
|
||||
|
||||
void *ag_malloc(size_t size);
|
||||
void *ag_realloc(void *ptr, size_t size);
|
||||
void *ag_calloc(size_t nelem, size_t elsize);
|
||||
|
|
@ -42,16 +37,18 @@ typedef struct {
|
|||
} match_t;
|
||||
|
||||
typedef struct {
|
||||
size_t total_bytes;
|
||||
size_t total_files;
|
||||
size_t total_matches;
|
||||
size_t total_file_matches;
|
||||
long total_bytes;
|
||||
long total_files;
|
||||
long total_matches;
|
||||
long total_file_matches;
|
||||
struct timeval time_start;
|
||||
struct timeval time_end;
|
||||
} ag_stats;
|
||||
|
||||
|
||||
extern ag_stats stats;
|
||||
ag_stats stats;
|
||||
|
||||
typedef const char *(*strncmp_fp)(const char *, const char *, const size_t, const size_t, const size_t[], const size_t *);
|
||||
|
||||
/* Union to translate between chars and words without violating strict aliasing */
|
||||
typedef union {
|
||||
|
|
@ -69,15 +66,18 @@ void generate_hash(const char *find, const size_t f_len, uint8_t *H, const int c
|
|||
|
||||
/* max is already defined on spec-violating compilers such as MinGW */
|
||||
size_t ag_max(size_t a, size_t b);
|
||||
size_t ag_min(size_t a, size_t b);
|
||||
|
||||
const char *boyer_moore_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len,
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup, const int case_insensitive);
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup);
|
||||
const char *boyer_moore_strncasestr(const char *s, const char *find, const size_t s_len, const size_t f_len,
|
||||
const size_t alpha_skip_lookup[], const size_t *find_skip_lookup);
|
||||
const char *hash_strnstr(const char *s, const char *find, const size_t s_len, const size_t f_len, uint8_t *h_table, const int case_sensitive);
|
||||
|
||||
strncmp_fp get_strstr(enum case_behavior opts);
|
||||
|
||||
size_t invert_matches(const char *buf, const size_t buf_len, match_t matches[], size_t matches_len);
|
||||
void realloc_matches(match_t **matches, size_t *matches_size, size_t matches_len);
|
||||
void compile_study(pcre **re, pcre_extra **re_extra, char *q, const int pcre_opts, const int study_opts);
|
||||
void compile_study(pcre2_code **re, pcre2_compile_context **re_ctx, char *q, const uint32_t pcre_opts, const int study_opts);
|
||||
|
||||
|
||||
int is_binary(const void *buf, const size_t buf_len);
|
||||
|
|
|
|||
403
src/zfile.c
|
|
@ -1,403 +0,0 @@
|
|||
#ifdef __FreeBSD__
|
||||
#include <sys/endian.h>
|
||||
#endif
|
||||
#include <sys/types.h>
|
||||
|
||||
#ifdef __CYGWIN__
|
||||
typedef _off64_t off64_t;
|
||||
#endif
|
||||
|
||||
#include <assert.h>
|
||||
#include <errno.h>
|
||||
#include <inttypes.h>
|
||||
#include <limits.h>
|
||||
#include <stdbool.h>
|
||||
#include <stdint.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "config.h"
|
||||
|
||||
#ifdef HAVE_ERR_H
|
||||
#include <err.h>
|
||||
#endif
|
||||
#ifdef HAVE_ZLIB_H
|
||||
#include <zlib.h>
|
||||
#endif
|
||||
#ifdef HAVE_LZMA_H
|
||||
#include <lzma.h>
|
||||
#endif
|
||||
|
||||
#include "decompress.h"
|
||||
|
||||
#if HAVE_FOPENCOOKIE
|
||||
|
||||
#define min(a, b) ({ \
|
||||
__typeof (a) _a = (a); \
|
||||
__typeof (b) _b = (b); \
|
||||
_a < _b ? _a : _b; })
|
||||
|
||||
static cookie_read_function_t zfile_read;
|
||||
static cookie_seek_function_t zfile_seek;
|
||||
static cookie_close_function_t zfile_close;
|
||||
|
||||
static const cookie_io_functions_t zfile_io = {
|
||||
.read = zfile_read,
|
||||
.write = NULL,
|
||||
.seek = zfile_seek,
|
||||
.close = zfile_close,
|
||||
};
|
||||
|
||||
#define KB (1024)
|
||||
struct zfile {
|
||||
FILE *in; // Source FILE stream
|
||||
uint64_t logic_offset, // Logical offset in output (forward seeks)
|
||||
decode_offset, // Where we've decoded to
|
||||
actual_len;
|
||||
uint32_t outbuf_start;
|
||||
|
||||
ag_compression_type ctype;
|
||||
|
||||
union {
|
||||
z_stream gz;
|
||||
lzma_stream lzma;
|
||||
} stream;
|
||||
|
||||
uint8_t inbuf[32 * KB];
|
||||
uint8_t outbuf[256 * KB];
|
||||
bool eof;
|
||||
};
|
||||
|
||||
#define CAVAIL_IN(c) ((c)->ctype == AG_GZIP ? (c)->stream.gz.avail_in : (c)->stream.lzma.avail_in)
|
||||
#define CNEXT_OUT(c) ((c)->ctype == AG_GZIP ? (c)->stream.gz.next_out : (c)->stream.lzma.next_out)
|
||||
|
||||
static int
|
||||
zfile_cookie_init(struct zfile *cookie) {
|
||||
#ifdef HAVE_LZMA_H
|
||||
lzma_ret lzrc;
|
||||
#endif
|
||||
int rc;
|
||||
|
||||
assert(cookie->logic_offset == 0);
|
||||
assert(cookie->decode_offset == 0);
|
||||
|
||||
cookie->actual_len = 0;
|
||||
|
||||
switch (cookie->ctype) {
|
||||
#ifdef HAVE_ZLIB_H
|
||||
case AG_GZIP:
|
||||
memset(&cookie->stream.gz, 0, sizeof cookie->stream.gz);
|
||||
rc = inflateInit2(&cookie->stream.gz, 32 + 15);
|
||||
if (rc != Z_OK) {
|
||||
log_err("Unable to initialize zlib: %s", zError(rc));
|
||||
return EIO;
|
||||
}
|
||||
cookie->stream.gz.next_in = NULL;
|
||||
cookie->stream.gz.avail_in = 0;
|
||||
cookie->stream.gz.next_out = cookie->outbuf;
|
||||
cookie->stream.gz.avail_out = sizeof cookie->outbuf;
|
||||
break;
|
||||
#endif
|
||||
#ifdef HAVE_LZMA_H
|
||||
case AG_XZ:
|
||||
cookie->stream.lzma = (lzma_stream)LZMA_STREAM_INIT;
|
||||
lzrc = lzma_auto_decoder(&cookie->stream.lzma, -1, 0);
|
||||
if (lzrc != LZMA_OK) {
|
||||
log_err("Unable to initialize lzma_auto_decoder: %d", lzrc);
|
||||
return EIO;
|
||||
}
|
||||
cookie->stream.lzma.next_in = NULL;
|
||||
cookie->stream.lzma.avail_in = 0;
|
||||
cookie->stream.lzma.next_out = cookie->outbuf;
|
||||
cookie->stream.lzma.avail_out = sizeof cookie->outbuf;
|
||||
break;
|
||||
#endif
|
||||
default:
|
||||
log_err("Unsupported compression type: %d", cookie->ctype);
|
||||
return EINVAL;
|
||||
}
|
||||
|
||||
|
||||
cookie->outbuf_start = 0;
|
||||
cookie->eof = false;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void
|
||||
zfile_cookie_cleanup(struct zfile *cookie) {
|
||||
switch (cookie->ctype) {
|
||||
#ifdef HAVE_ZLIB_H
|
||||
case AG_GZIP:
|
||||
inflateEnd(&cookie->stream.gz);
|
||||
break;
|
||||
#endif
|
||||
#ifdef HAVE_LZMA_H
|
||||
case AG_XZ:
|
||||
lzma_end(&cookie->stream.lzma);
|
||||
break;
|
||||
#endif
|
||||
default:
|
||||
/* Compiler false positive - unreachable. */
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Open compressed file 'path' as a (forward-)seekable (and rewindable),
|
||||
* read-only stream.
|
||||
*/
|
||||
FILE *
|
||||
decompress_open(int fd, const char *mode, ag_compression_type ctype) {
|
||||
struct zfile *cookie;
|
||||
FILE *res, *in;
|
||||
int error;
|
||||
|
||||
cookie = NULL;
|
||||
in = res = NULL;
|
||||
if (strstr(mode, "w") || strstr(mode, "a")) {
|
||||
errno = EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
in = fdopen(fd, mode);
|
||||
if (in == NULL)
|
||||
goto out;
|
||||
|
||||
/*
|
||||
* No validation of compression type is done -- file is assumed to
|
||||
* match input. In Ag, the compression type is already detected, so
|
||||
* that's ok.
|
||||
*/
|
||||
cookie = malloc(sizeof *cookie);
|
||||
if (cookie == NULL) {
|
||||
errno = ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
cookie->in = in;
|
||||
cookie->logic_offset = 0;
|
||||
cookie->decode_offset = 0;
|
||||
cookie->ctype = ctype;
|
||||
|
||||
error = zfile_cookie_init(cookie);
|
||||
if (error != 0) {
|
||||
errno = error;
|
||||
goto out;
|
||||
}
|
||||
|
||||
res = fopencookie(cookie, mode, zfile_io);
|
||||
|
||||
out:
|
||||
if (res == NULL) {
|
||||
if (in != NULL)
|
||||
fclose(in);
|
||||
if (cookie != NULL)
|
||||
free(cookie);
|
||||
}
|
||||
return res;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return number of bytes into buf, 0 on EOF, -1 on error. Update stream
|
||||
* offset.
|
||||
*/
|
||||
static ssize_t
|
||||
zfile_read(void *cookie_, char *buf, size_t size) {
|
||||
struct zfile *cookie = cookie_;
|
||||
size_t nb, ignorebytes;
|
||||
ssize_t total = 0;
|
||||
lzma_ret lzret;
|
||||
int ret;
|
||||
|
||||
assert(size <= SSIZE_MAX);
|
||||
|
||||
if (size == 0)
|
||||
return 0;
|
||||
|
||||
if (cookie->eof)
|
||||
return 0;
|
||||
|
||||
ret = Z_OK;
|
||||
lzret = LZMA_OK;
|
||||
|
||||
ignorebytes = cookie->logic_offset - cookie->decode_offset;
|
||||
assert(ignorebytes == 0);
|
||||
|
||||
do {
|
||||
size_t inflated;
|
||||
|
||||
/* Drain output buffer first */
|
||||
while (CNEXT_OUT(cookie) >
|
||||
&cookie->outbuf[cookie->outbuf_start]) {
|
||||
size_t left = CNEXT_OUT(cookie) -
|
||||
&cookie->outbuf[cookie->outbuf_start];
|
||||
size_t ignoreskip = min(ignorebytes, left);
|
||||
size_t toread;
|
||||
|
||||
if (ignoreskip > 0) {
|
||||
ignorebytes -= ignoreskip;
|
||||
left -= ignoreskip;
|
||||
cookie->outbuf_start += ignoreskip;
|
||||
cookie->decode_offset += ignoreskip;
|
||||
}
|
||||
|
||||
// Ran out of output before we seek()ed up.
|
||||
if (ignorebytes > 0)
|
||||
break;
|
||||
|
||||
toread = min(left, size);
|
||||
memcpy(buf, &cookie->outbuf[cookie->outbuf_start],
|
||||
toread);
|
||||
|
||||
buf += toread;
|
||||
size -= toread;
|
||||
left -= toread;
|
||||
cookie->outbuf_start += toread;
|
||||
cookie->decode_offset += toread;
|
||||
cookie->logic_offset += toread;
|
||||
total += toread;
|
||||
|
||||
if (size == 0)
|
||||
break;
|
||||
}
|
||||
|
||||
if (size == 0)
|
||||
break;
|
||||
|
||||
/*
|
||||
* If we have not satisfied read, the output buffer must be
|
||||
* empty.
|
||||
*/
|
||||
assert(cookie->stream.gz.next_out ==
|
||||
&cookie->outbuf[cookie->outbuf_start]);
|
||||
|
||||
if ((cookie->ctype == AG_XZ && lzret == LZMA_STREAM_END) ||
|
||||
(cookie->ctype == AG_GZIP && ret == Z_STREAM_END)) {
|
||||
cookie->eof = true;
|
||||
break;
|
||||
}
|
||||
|
||||
/* Read more input if empty */
|
||||
if (CAVAIL_IN(cookie) == 0) {
|
||||
nb = fread(cookie->inbuf, 1, sizeof cookie->inbuf,
|
||||
cookie->in);
|
||||
if (ferror(cookie->in)) {
|
||||
warn("error read core");
|
||||
exit(1);
|
||||
}
|
||||
if (nb == 0 && feof(cookie->in)) {
|
||||
warn("truncated file");
|
||||
exit(1);
|
||||
}
|
||||
if (cookie->ctype == AG_XZ) {
|
||||
cookie->stream.lzma.avail_in = nb;
|
||||
cookie->stream.lzma.next_in = cookie->inbuf;
|
||||
} else {
|
||||
cookie->stream.gz.avail_in = nb;
|
||||
cookie->stream.gz.next_in = cookie->inbuf;
|
||||
}
|
||||
}
|
||||
|
||||
/* Reset stream state to beginning of output buffer */
|
||||
if (cookie->ctype == AG_XZ) {
|
||||
cookie->stream.lzma.next_out = cookie->outbuf;
|
||||
cookie->stream.lzma.avail_out = sizeof cookie->outbuf;
|
||||
} else {
|
||||
cookie->stream.gz.next_out = cookie->outbuf;
|
||||
cookie->stream.gz.avail_out = sizeof cookie->outbuf;
|
||||
}
|
||||
cookie->outbuf_start = 0;
|
||||
|
||||
if (cookie->ctype == AG_GZIP) {
|
||||
ret = inflate(&cookie->stream.gz, Z_NO_FLUSH);
|
||||
if (ret != Z_OK && ret != Z_STREAM_END) {
|
||||
log_err("Found mem/data error while decompressing zlib stream: %s", zError(ret));
|
||||
return -1;
|
||||
}
|
||||
} else {
|
||||
lzret = lzma_code(&cookie->stream.lzma, LZMA_RUN);
|
||||
if (lzret != LZMA_OK && lzret != LZMA_STREAM_END) {
|
||||
log_err("Found mem/data error while decompressing xz/lzma stream: %d", lzret);
|
||||
return -1;
|
||||
}
|
||||
}
|
||||
inflated = CNEXT_OUT(cookie) - &cookie->outbuf[0];
|
||||
cookie->actual_len += inflated;
|
||||
} while (!ferror(cookie->in) && size > 0);
|
||||
|
||||
assert(total <= SSIZE_MAX);
|
||||
return total;
|
||||
}
|
||||
|
||||
static int
|
||||
zfile_seek(void *cookie_, off64_t *offset_, int whence) {
|
||||
struct zfile *cookie = cookie_;
|
||||
off64_t new_offset = 0, offset = *offset_;
|
||||
|
||||
if (whence == SEEK_SET) {
|
||||
new_offset = offset;
|
||||
} else if (whence == SEEK_CUR) {
|
||||
new_offset = (off64_t)cookie->logic_offset + offset;
|
||||
} else {
|
||||
/* SEEK_END not ok */
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (new_offset < 0)
|
||||
return -1;
|
||||
|
||||
/* Backward seeks to anywhere but 0 are not ok */
|
||||
if (new_offset < (off64_t)cookie->logic_offset && new_offset != 0) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (new_offset == 0) {
|
||||
/* rewind(3) */
|
||||
cookie->decode_offset = 0;
|
||||
cookie->logic_offset = 0;
|
||||
zfile_cookie_cleanup(cookie);
|
||||
zfile_cookie_init(cookie);
|
||||
} else if ((uint64_t)new_offset > cookie->logic_offset) {
|
||||
/* Emulate forward seek by skipping ... */
|
||||
char *buf;
|
||||
const size_t bsz = 32 * 1024;
|
||||
|
||||
buf = malloc(bsz);
|
||||
while ((uint64_t)new_offset > cookie->logic_offset) {
|
||||
size_t diff = min(bsz,
|
||||
(uint64_t)new_offset - cookie->logic_offset);
|
||||
ssize_t err = zfile_read(cookie_, buf, diff);
|
||||
if (err < 0) {
|
||||
free(buf);
|
||||
return -1;
|
||||
}
|
||||
|
||||
/* Seek past EOF gets positioned at EOF */
|
||||
if (err == 0) {
|
||||
assert(cookie->eof);
|
||||
new_offset = cookie->logic_offset;
|
||||
break;
|
||||
}
|
||||
}
|
||||
free(buf);
|
||||
}
|
||||
|
||||
assert(cookie->logic_offset == (uint64_t)new_offset);
|
||||
|
||||
*offset_ = new_offset;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int
|
||||
zfile_close(void *cookie_) {
|
||||
struct zfile *cookie = cookie_;
|
||||
|
||||
zfile_cookie_cleanup(cookie);
|
||||
fclose(cookie->in);
|
||||
free(cookie);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* HAVE_FOPENCOOKIE */
|
||||
|
|
@@ -15,7 +15,14 @@ Search a big file:
234881024:hello7516192768
268435456:hello

Fail to regex search a big file:
Regex search a big file:

$ $TESTDIR/../../ag --nocolor --workers=1 --parallel 'hello.*' $TESTDIR/big_file.txt
ERR: Skipping */big_file.txt: pcre_exec() can't handle files larger than 2147483647 bytes. (glob)
[1]
33554432:hello1073741824
67108864:hello2147483648
100663296:hello3221225472
134217728:hello4294967296
167772160:hello5368709120
201326592:hello6442450944
234881024:hello7516192768
268435456:hello
@@ -12,7 +12,7 @@ Ensure column is correct:
# Test ackmate output. Not quite right, but at least offsets are in the
# ballpark instead of being 9 quintillion

$ ag --ackmate "lah\nb"
$ ag --ackmate "blah\nb"
:blah.txt
1;blah
2;1 5:blah2
2;0 6:blah2
@@ -1,9 +0,0 @@
Setup:

$ . $TESTDIR/setup.sh
$ printf "hello world\n" >test.txt

Verify ag runs with an empty environment:

$ env -i $TESTDIR/../ag --noaffinity --nocolor --workers=1 --parallel hello
test.txt:1:hello world
@@ -13,10 +13,10 @@ A genuine zero-length match should succeed:
1:foo

Empty files should be listed with --unrestricted --files-with-matches (-ul)
$ ag -lu --stats | sed '$d' | sort # Remove the last line about timing which will differ
2 files contained matches
2 files searched
2 matches
4 bytes searched
$ ag -lu --stats | sed '$d' # Remove the last line about timing which will differ
empty.txt
nonempty.txt
2 matches
2 files contained matches
2 files searched
4 bytes searched
@@ -3,10 +3,6 @@ Setup:
$ . $TESTDIR/setup.sh
$ printf 'foo\n' > ./foo.txt
$ printf 'bar\n' > ./bar.txt
$ printf 'foo\nbar\nbaz\n' > ./baz.txt
$ printf 'duck\nanother duck\nyet another duck\n' > ./duck.txt
$ cp duck.txt goose.txt
$ echo "GOOSE!!!" >> ./goose.txt

Files with matches:

@@ -16,17 +12,8 @@ Files with matches:
foo.txt
$ ag --files-with-matches foo bar.txt
[1]
$ ag --files-with-matches foo foo.txt bar.txt baz.txt
foo.txt
baz.txt
$ ag --files-with-matches bar foo.txt bar.txt baz.txt
bar.txt
baz.txt
$ ag --files-with-matches foo bar.txt baz.txt
baz.txt

Files without matches:
(Prints names of files in which no line matches query)

$ ag --files-without-matches bar foo.txt
foo.txt
@@ -34,30 +21,3 @@ Files without matches:
foo.txt
$ ag --files-without-matches bar bar.txt
[1]
$ ag --files-without-matches foo foo.txt bar.txt baz.txt
bar.txt
$ ag --files-without-matches bar foo.txt bar.txt baz.txt
foo.txt

Files with inverted matches:
(Prints names of files in which some line doesn't match query)

$ ag --files-with-matches --invert-match bar bar.txt
[1]
$ ag --files-with-matches --invert-match foo foo.txt bar.txt baz.txt
bar.txt
baz.txt
$ ag --files-with-matches --invert-match bar foo.txt bar.txt baz.txt
foo.txt
baz.txt

Files without inverted matches:
(Prints names of files in which no line doesn't match query,
i.e. where every line matches query)

$ ag --files-without-matches --invert-match duck duck.txt
duck.txt
$ ag --files-without-matches --invert-match duck goose.txt
[1]
$ ag --files-without-matches --invert-match duck duck.txt goose.txt
duck.txt
@@ -6,12 +6,10 @@ Setup:
$ printf 'targetA\n' > something.js
$ printf 'targetB\n' > aFile.test.txt
$ printf 'targetC\n' > aFile.txt
$ printf 'targetG\n' > something.min.js
$ mkdir -p subdir
$ printf 'targetD\n' > subdir/somethingElse.js
$ printf 'targetE\n' > subdir/anotherFile.test.txt
$ printf 'targetF\n' > subdir/anotherFile.txt
$ printf 'targetH\n' > subdir/somethingElse.min.js

Ignore patterns with single extension in root directory:

@@ -23,11 +21,6 @@ Ignore patterns with multiple extensions in root directory:
$ ag "targetB"
[1]

*.js ignores *.min.js in root directory:

$ ag "targetG"
[1]

Do not ignore patterns with partial extensions in root directory:

$ ag "targetC"
@@ -43,11 +36,6 @@ Ignore patterns with multiple extensions in subdirectory:
$ ag "targetE"
[1]

*.js ignores *.min.js in subdirectory:

$ ag "targetH"
[1]

Do not ignore patterns with partial extensions in subdirectory:

$ ag "targetF"
@@ -1,12 +0,0 @@
Setup:

$ . $TESTDIR/setup.sh
$ printf 'blah1\n' > ./printme.txt
$ printf 'blah2\n' > ./dontprintme.c
$ printf '*\n' > ./.ignore
$ printf '!*.txt\n' >> ./.ignore

Ignore .gitignore patterns but not .ignore patterns:

$ ag blah
printme.txt:1:blah1
@@ -1,19 +0,0 @@
Setup:

$ . $TESTDIR/setup.sh
$ mkdir -p subdir/ignoredir
$ mkdir ignoredir
$ printf 'match1\n' > subdir/ignoredir/file1.txt
$ printf 'match1\n' > ignoredir/file1.txt
$ printf '/ignoredir\n' > subdir/.ignore

Ignore file in subdir/ignoredir, but not in ignoredir:

$ ag match
ignoredir/file1.txt:1:match1

From subdir, ignore file in subdir/ignoredir:

$ cd subdir
$ ag match
[1]
@@ -12,30 +12,18 @@ Language types are output:
--ada
.ada .adb .ads

--asciidoc
.adoc .ad .asc .asciidoc

--apl
.apl

--asm
.asm .s

--asp
.asp .asa .aspx .asax .ashx .ascx .asmx

--aspx
.asp .asa .aspx .asax .ashx .ascx .asmx

--batch
.bat .cmd

--bazel
.bazel

--bitbake
.bb .bbappend .bbclass .inc

--bro
.bro .bif

--cc
.c .h .xs

@@ -46,17 +34,11 @@ Language types are output:
.chpl

--clojure
.clj .cljs .cljc .cljx .edn
.clj .cljs .cljc .cljx

--coffee
.coffee .cjsx

--config
.config

--coq
.coq .g .v

--cpp
.cpp .cc .C .cxx .m .hpp .hh .h .H .hxx .tpp

@@ -66,9 +48,6 @@ Language types are output:
--csharp
.cs

--cshtml
.cshtml

--css
.css

@@ -78,15 +57,6 @@ Language types are output:
--delphi
.pas .int .dfm .nfm .dof .dpk .dpr .dproj .groupproj .bdsgroup .bdsproj

--dlang
.d .di

--dot
.dot .gv

--dts
.dts .dtsi

--ebuild
.ebuild .eclass

@@ -96,9 +66,6 @@ Language types are output:
--elixir
.ex .eex .exs

--elm
.elm

--erlang
.erl .hrl

@@ -106,7 +73,7 @@ Language types are output:
.factor

--fortran
.f .F .f77 .f90 .F90 .f95 .f03 .for .ftn .fpp .FPP
.f .f77 .f90 .f95 .f03 .for .ftn .fpp

--fsharp
.fs .fsi .fsx
@@ -120,23 +87,14 @@ Language types are output:
--go
.go

--gradle
.gradle

--groovy
.groovy .gtmpl .gpp .grunit .gradle
.groovy .gtmpl .gpp .grunit

--haml
.haml

--handlebars
.hbs

--haskell
.hs .hsig .lhs

--haxe
.hx
.hs .lhs

--hh
.h
@@ -144,38 +102,23 @@ Language types are output:
--html
.htm .html .shtml .xhtml

--idris
.idr .ipkg .lidr

--ini
.ini

--ipython
.ipynb

--isabelle
.thy

--j
.ijs

--jade
.jade

--java
.java .properties

--jinja2
.j2

--js
.es6 .js .jsx .vue
.js .jsx .vue

--json
.json

--jsp
.jsp .jspx .jhtm .jhtml .jspf .tag .tagf
.jsp .jspx .jhtm .jhtml

--julia
.jl
@@ -192,9 +135,6 @@ Language types are output:
--lisp
.lisp .lsp

--log
.log

--lua
.lua

@@ -219,21 +159,12 @@ Language types are output:
--mathematica
.m .wl

--md
.markdown .mdown .mdwn .mkdn .mkd .md

--mercury
.m .moo

--naccess
.asa .rsa

--nim
.nim

--nix
.nix

--objc
.m .h

@@ -246,15 +177,9 @@ Language types are output:
--octave
.m

--org
.org

--parrot
.pir .pasm .pmc .ops .pod .pg .tg

--pdb
.pdb

--perl
.pl .pm .pm6 .pod .t

@@ -264,24 +189,12 @@ Language types are output:
--pike
.pike .pmod

--plist
.plist

--plone
.pt .cpt .metadata .cpy .py .xml .zcml

--powershell
.ps1

--proto
.proto

--ps1
.ps1

--pug
.pug

--puppet
.pp

@@ -297,9 +210,6 @@ Language types are output:
--rake
.Rakefile

--razor
.cshtml

--restructuredtext
.rst

@@ -307,7 +217,7 @@ Language types are output:
.rs

--r
.r .R .Rmd .Rnw .Rtex .Rrst
.R .Rmd .Rnw .Rtex .Rrst

--rdoc
.rdoc
@@ -342,9 +252,6 @@ Language types are output:
--sql
.sql .ctl

--stata
.do .ado

--stylus
.styl

@@ -354,18 +261,9 @@ Language types are output:
--tcl
.tcl .itcl .itk

--terraform
.tf .tfvars

--tex
.tex .cls .sty

--thrift
.thrift

--tla
.tla

--tt
.tt .tt2 .ttml

@@ -375,9 +273,6 @@ Language types are output:
--ts
.ts .tsx

--twig
.twig

--vala
.vala .vapi

@@ -388,7 +283,7 @@ Language types are output:
.vm .vtl .vsl

--verilog
.v .vh .sv .svh
.v .vh .sv

--vhdl
.vhd .vhdl
@@ -396,9 +291,6 @@ Language types are output:
--vim
.vim

--vue
.vue

--wix
.wxi .wxs

@@ -409,14 +301,8 @@ Language types are output:
.wadl

--xml
.xml .dtd .xsl .xslt .xsd .ent .tld .plist .wsdl
.xml .dtd .xsl .xslt .ent .tld

--yaml
.yaml .yml

--zeek
.zeek .bro .bif

--zephir
.zep

@@ -1,16 +0,0 @@
Setup:

$ . $TESTDIR/setup.sh
$ printf 'foo\n' > ./foo.txt
$ printf 'bar\n' > ./bar.txt
$ printf 'baz\n' > ./baz.txt

All files:

$ ag --print-all-files --group foo | sort


1:foo
bar.txt
baz.txt
foo.txt
@@ -1,5 +1,5 @@
%define _bashcompdir %_sysconfdir/bash_completion.d
%define _zshcompdir %{_datadir}/zsh/site-functions


Name: the_silver_searcher
Version: @VERSION@
@@ -12,8 +12,8 @@ URL: https://github.com/ggreer/%{name}
Source0: https://github.com/downloads/ggreer/%{name}/%{name}-%{version}.tar.gz
BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XXXXXX)

BuildRequires: pcre-devel, xz-devel, zlib-devel
Requires: pcre, xz, zlib
BuildRequires: pcre2-devel, xz-devel, zlib-devel
Requires: pcre2, xz, zlib

%description
The Silver Searcher
@@ -29,7 +29,7 @@ How is it so fast?
* Searching for literals (no regex) uses Boyer-Moore-Horspool strstr.
* Files are mmap()ed instead of read into a buffer.
* If you're building with PCRE 8.21 or greater, regex searches use the JIT compiler.
* Ag calls pcre_study() before executing the regex on a jillion files.
* Ag calls pcre2_study() before executing the regex on a jillion files.
* Instead of calling fnmatch() on every pattern in your ignore files, non-regex patterns are loaded into an array and binary searched.
* Ag uses Pthreads to take advantage of multiple CPU cores and search files in parallel.

@@ -62,7 +62,7 @@ rm -rf ${RPM_BUILD_ROOT}
%{_mandir}/*
%config %{_bashcompdir}/ag.bashcomp.sh
%config %{_datadir}/%{name}/completions/ag.bashcomp.sh
%config %{_datadir}/zsh/site-functions/_the_silver_searcher


%changelog
* Thu Dec 5 2013 Emily Strickland <code@emily.st> - 0.18.1-1