@timvisee
Last active May 16, 2023 13:42

Revisions

  1. timvisee revised this gist May 16, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion qdrant-prime-mmap-benchmark.md
    @@ -33,7 +33,7 @@ Request:
    ## Results

    - **Cold** means all (disk) caches are purged.
-   - **Warm** means disk cache is still available from a previous run.
+   - **Hot** means disk cache is still available from a previous run.

    ### Normal

  2. timvisee revised this gist May 16, 2023. 1 changed file with 11 additions and 0 deletions.
    11 changes: 11 additions & 0 deletions qdrant-prime-mmap-benchmark.md
    @@ -5,6 +5,9 @@ Machine:
    - 32GB RAM
    - Swap disabled

    Code:
    - https://github.com/qdrant/qdrant/commit/6bd752be8d85a375d0af6984bbb825c5b6497496

    Collection:
    - 10_000_000 vectors
    - 512 dimensions
    @@ -34,6 +37,8 @@ Request:

    ### Normal

    `./qdrant`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    @@ -49,6 +54,8 @@ Not having mmap pages ready in cache adds ~45s.

    ### With `MADV_WILLNEED`

    `MADVISE_WILL_NEED=1 ./qdrant`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    @@ -64,6 +71,8 @@ No visible improvement. This doesn't pre-fault all mmap pages.

    ### With `MADV_WILLNEED` and read first byte

    `MADVISE_WILL_NEED=1 MADVISE_READ_BYTE=1 ./qdrant`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    @@ -80,6 +89,8 @@ reading the first byte from the first page.

    ### With `MAP_POPULATE`

    `MMAP_POPULATE=1 ./qdrant`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|14s|6s|
  3. timvisee created this gist May 16, 2023.
    99 changes: 99 additions & 0 deletions qdrant-prime-mmap-benchmark.md
    @@ -0,0 +1,99 @@
    # Simple mmap benchmark

    Machine:
    - Linux 6.2
    - 32GB RAM
    - Swap disabled

    Collection:
    - 10_000_000 vectors
    - 512 dimensions
    - 20.9GB on disk
    - Using mmap (threshold 1000)
    - No index

    Request:
    - Search:
    ```
    POST /collections/test/points/search?exact=true
    {
    "limit": 1000,
    "vector": [
    -0.00022172113,
    -0.0005458312,
    ...
    ]
    }
    ```
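
A minimal sketch of how this request could be built and timed from Python. The base URL (`localhost:6333`, Qdrant's default HTTP port), the random placeholder vector, and the helper names `build_search_request` and `timed_search` are assumptions for illustration; the original query vector is elided above and is not reproduced here.

```python
import json
import random
import time
import urllib.request

# Assumed for illustration: a local Qdrant instance on its default HTTP port.
BASE_URL = "http://localhost:6333"


def build_search_request(vector, limit=1000, collection="test"):
    """Build the exact-search POST request used in the benchmark."""
    body = json.dumps({"limit": limit, "vector": vector}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/collections/{collection}/points/search?exact=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_search(request):
    """Send the request and return (elapsed_seconds, parsed_response)."""
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return time.perf_counter() - start, payload


# Placeholder query vector with the collection's 512 dimensions.
query = [random.uniform(-1.0, 1.0) for _ in range(512)]
request = build_search_request(query)
```

Calling `timed_search(request)` twice in a row against a running instance would give figures comparable to the "First search" and "Second search" rows.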

    ## Results

    - **Cold** means all (disk) caches are purged.
    - **Warm** means disk cache is still available from a previous run.
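
The gist does not say how the caches were purged. One common way on Linux, shown here as an assumption rather than the author's method, is to flush dirty pages and then write `3` to `/proc/sys/vm/drop_caches`, which requires root. The function name and its `control_file` parameter are hypothetical.

```python
import os


def purge_disk_caches(control_file="/proc/sys/vm/drop_caches"):
    """Drop the Linux page cache, dentries and inodes. Requires root."""
    os.sync()  # flush dirty pages first so that they can be dropped
    with open(control_file, "w") as f:
        f.write("3\n")
```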

    ### Normal

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    |- VIRT|29.6G|29.6G|
    |- RES|1417M|1439M|
    |- SHR|68K|68K|
    |**First search**|44.35s|433ms|
    |- RES|20.9G|20.9G|
    |- SHR|19.5G|19.5G|
    |**Second search**|433ms|498ms|

    Not having mmap pages ready in the page cache adds ~45s to the first search
    (44.35s cold vs. 433ms hot).
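
The VIRT, RES and SHR rows use the column names shown by `top`. The gist does not say which tool produced them; as a sketch under that caveat, the same three figures can be read on Linux from the first three fields of `/proc/<pid>/statm`, which are expressed in pages. The helper names are hypothetical.

```python
import os


def parse_statm(text, page_size):
    """Convert the first three statm fields (in pages) to bytes."""
    size, resident, shared = (int(field) for field in text.split()[:3])
    return {
        "VIRT": size * page_size,
        "RES": resident * page_size,
        "SHR": shared * page_size,
    }


def memory_usage(pid="self"):
    """Read VIRT/RES/SHR for a process from /proc (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Sampling `memory_usage(pid)` after startup and again after the first search would show RES growing as mmap pages are faulted in.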

    ### With `MADV_WILLNEED`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    |- VIRT|29.6G|29.6G|
    |- RES|1438M|1439M|
    |- SHR|68K|68K|
    |**First search**|47.11s|538ms|
    |- RES|20.9G|20.9G|
    |- SHR|19.5G|19.5G|
    |**Second search**|462ms|428ms|

    No visible improvement. This doesn't pre-fault all mmap pages.
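
A minimal sketch of what advising the kernel with `MADV_WILLNEED` looks like, using Python's `mmap` module rather than Qdrant's actual Rust code. The function name and the `path` argument are hypothetical. The advice is only a hint and the call does not wait for any I/O, so pages are still faulted in lazily on first access.

```python
import mmap


def map_with_willneed(path):
    """Map a file read-only and hint that it will be needed soon."""
    with open(path, "rb") as f:
        # Python duplicates the file descriptor, so the file can be closed.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # MADV_WILLNEED is only a hint: the call does not wait for any I/O.
    # The constant is missing on platforms without madvise support.
    if hasattr(mmap, "MADV_WILLNEED"):
        mapping.madvise(mmap.MADV_WILLNEED)
    return mapping
```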

    ### With `MADV_WILLNEED` and read first byte

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|5s|5s|
    |- VIRT|29.6G|29.6G|
    |- RES|1417M|1437M|
    |- SHR|68K|69K|
    |**First search**|46.88s|575ms|
    |- RES|20.9G|20.9G|
    |- SHR|19.5G|19.5G|
    |**Second search**|461ms|463ms|

    No visible improvement. This doesn't pre-fault all mmap pages, not even when
    reading the first byte from the first page.
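
For illustration, reading the first byte touches exactly one page, so only that page is faulted in. The second function below is a variant that was not benchmarked in this gist: it reads one byte per page, which would fault in the whole mapping but blocks on disk reads while doing so. Both function names are hypothetical.

```python
import mmap


def read_first_byte(mapping):
    """Touch only the first page, as in this benchmark variant."""
    return mapping[0]


def touch_every_page(mapping):
    """Not benchmarked in the gist: read one byte per page to fault in all pages."""
    checksum = 0
    for offset in range(0, len(mapping), mmap.PAGESIZE):
        # XOR the bytes together so that every read is actually used.
        checksum ^= mapping[offset]
    return checksum
```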

    ### With `MAP_POPULATE`

    | | **Cold** | **Hot** |
    |---:|---|---|
    |**Startup**|14s|6s|
    |- VIRT|29.6G|29.6G|
    |- RES|20.9G|20.9G|
    |- SHR|19.5G|19.5G|
    |**First search**|457ms|449ms|
    |- RES|20.9G|20.9G|
    |- SHR|19.5G|19.5G|
    |**Second search**|414ms|425ms|

    Populating does properly pre-fault all mmap pages, but this is blocking, and
    significantly increases the startup time and the time to first response.
    Populating only works on Linux.
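
A minimal sketch of a populated mapping, again using Python's `mmap` module rather than Qdrant's own code. `MAP_POPULATE` is only exposed on Linux (and in Python 3.10 or later), so the sketch falls back to an ordinary lazy mapping elsewhere. The function name and the `path` argument are hypothetical.

```python
import mmap


def map_populated(path):
    """Map a file read-only, pre-faulting its pages where supported."""
    flags = mmap.MAP_SHARED
    # MAP_POPULATE is Linux-only; without it this is a normal lazy mapping.
    flags |= getattr(mmap, "MAP_POPULATE", 0)
    with open(path, "rb") as f:
        # With MAP_POPULATE the call blocks while the pages are read in.
        return mmap.mmap(f.fileno(), 0, flags=flags, prot=mmap.PROT_READ)
```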

    On a cold start, populating adds 9s to the startup time (14s vs. 5s), but
    removes ~45s from the first search request (44.35s down to 457ms).
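
The trade-off can be checked with a little arithmetic. The sketch below adds the cold startup time to the cold first-search time for each variant, assuming the search is issued as soon as startup completes (an assumption; the gist reports the two phases separately, and startup is only given to the nearest second).

```python
# Cold-cache timings from the tables above, in seconds.
cold = {
    "Normal": {"startup": 5, "first_search": 44.35},
    "MADV_WILLNEED": {"startup": 5, "first_search": 47.11},
    "MADV_WILLNEED + read first byte": {"startup": 5, "first_search": 46.88},
    "MAP_POPULATE": {"startup": 14, "first_search": 0.457},
}

# Time from process start until the first search result is returned.
totals = {name: t["startup"] + t["first_search"] for name, t in cold.items()}

for name, total in totals.items():
    print(f"{name}: {total:.2f}s")
```

Under that assumption, a cold start with `MAP_POPULATE` reaches its first result after roughly 14.5s, compared with roughly 49.4s for the default configuration.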