usage:
./analyze.sh SIVA_BUCKETS_PATH TIMEOUT_BUCKET TIMEOUT_SIVA
where:
SIVA_BUCKETS_PATH
is where the siva buckets are stored, example: ~/repos/pga/siva/latest
TIMEOUT_BUCKET
is 10 times the limit, in seconds, beyond which a bucket will be marked as "needed to be reviewed"; example: if set to 60, it will wait 6 sec for each bucket to be initialized.
TIMEOUT_SIVA
is 10 times the limit, in seconds, beyond which a siva file will be marked as "slow"; example: if set to 20, it will wait 2 sec for each siva file to be initialized.
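Since both timeouts are given in tenths of a second, the conversion to seconds can be sketched as below. The helper name tenths_to_seconds is hypothetical and is not taken from analyze.sh; it only illustrates the arithmetic using integer shell operations.

```shell
# Hypothetical helper: convert a timeout given in tenths of a second
# (as TIMEOUT_BUCKET and TIMEOUT_SIVA are) into seconds with one decimal.
tenths_to_seconds() {
  local tenths="$1"
  # Integer division gives whole seconds, the remainder gives the decimal digit.
  printf '%d.%d\n' "$((tenths / 10))" "$((tenths % 10))"
}

tenths_to_seconds 60   # prints 6.0
tenths_to_seconds 20   # prints 2.0
tenths_to_seconds 15   # prints 1.5
```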
In the first iteration, it tries to initialize each bucket under SIVA_BUCKETS_PATH, and if a bucket cannot be initialized in less than TIMEOUT_BUCKET/10 seconds, it is marked as "needed to be reviewed".
In the second iteration, it takes all buckets marked as "needed to be reviewed" in the previous iteration, tries each siva file inside them, and adds to error.log those that needed more than TIMEOUT_SIVA/10 seconds to be initialized.
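The two iterations described above can be sketched roughly as follows. This is not the actual content of analyze.sh: INIT_CMD is a hypothetical placeholder for whatever command initializes gitbase on a path, the function names are made up for illustration, and the limits are assumed to be already converted to seconds. It also assumes the timeout command from GNU coreutils, which exits with status 124 when the time limit is hit.

```shell
# INIT_CMD is a hypothetical placeholder, not a real gitbase invocation.
# Replace it with the command that initializes gitbase on the given path.
INIT_CMD="${INIT_CMD:-true}"

# Print every given path whose initialization exceeds the limit (in seconds).
exceeds_limit() {
  local limit="$1"; shift
  local path status
  for path in "$@"; do
    status=0
    # $INIT_CMD is left unquoted on purpose so it can carry its own arguments.
    timeout "$limit" $INIT_CMD "$path" </dev/null || status=$?
    # GNU timeout reports 124 when it had to stop the command.
    if [ "$status" -eq 124 ]; then
      printf '%s\n' "$path"
    fi
  done
}

# analyze ROOT BUCKET_LIMIT SIVA_LIMIT (both limits in seconds)
analyze() {
  local root="$1" bucket_limit="$2" siva_limit="$3"
  local bucket
  # First iteration: find the buckets that "need to be reviewed".
  exceeds_limit "$bucket_limit" "$root"/*/ | while IFS= read -r bucket; do
    # Second iteration: log the slow siva files inside each such bucket.
    exceeds_limit "$siva_limit" "$bucket"*.siva >> error.log
  done
}
```

Checking whole buckets first keeps the run short: only the buckets that time out pay the cost of a file-by-file pass.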
dependencies:
gitbase, which can be installed with go get -u -v github.com/src-d/gitbase/cmd/gitbase
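A quick way to confirm the dependency is available before running the script is sketched below. The require_cmd helper is hypothetical and not part of analyze.sh; it relies only on the standard command -v lookup.

```shell
# Hypothetical helper: fail with a message when a required command is missing.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing dependency: $1" >&2
    return 1
  fi
}

# Example: check for gitbase and print the install hint from this document.
require_cmd gitbase || echo "install it with: go get -u -v github.com/src-d/gitbase/cmd/gitbase" >&2
```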
example:
./analyze.sh ~/repos/pga/siva/latest 60 20
It performed well for a PGA subset containing 9k repositories, finding ~10 slow repos in 10 minutes. Once they were deleted from the subset, source{d} Engine was able to run queries over it.