Motherboard: Asus Pro WS WRX80E-SAGE SE WIFI
Card: Asus HYPER M.2 X16 GEN 4 CARD
NVMe: 4x Samsung SSD 980 PRO 1TB
OS: Linux fedora 5.16.12-200.fc35.x86_64
AER (Advanced Error Reporting) logs excessively:
/*
 * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: Apache-2.0
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
# Apply this config conditionally to all C files
If:
  PathMatch: .*\.(c|h)$
CompileFlags:
  Compiler: /usr/bin/gcc
---
# Apply this config conditionally to all C++ files
If:
# Store interactive Python shell history in ~/.cache/python_history
# instead of ~/.python_history.
#
# Create the following .config/pythonstartup.py file
# and export its path using PYTHONSTARTUP environment variable:
#
# export PYTHONSTARTUP="${XDG_CONFIG_HOME:-$HOME/.config}/pythonstartup.py"
import atexit
import os
import atexit
import ctypes
import os
import shlex
import sys
import tempfile
CMD_C_TO_SO = '{compiler} -shared -o {output} {input} {libraries}'
# /------------------------------------------\
# | don't forget to download the .tp file    |
# | and place it in the user's directory :›  |
# |                                          |
# | also install lolcat:                     |
# | https://github.com/busyloop/lolcat       |
# \------------------------------------------/
alias test-passed='if [ "$?" -eq "0" ]; then lolcat ~/.tp -a -s 40 -d 2; fi;'
/* Copyright (c) 2018 Arvid Gerstmann. */
/* This code is licensed under MIT license. */
#ifndef AG_RANDOM_H
#define AG_RANDOM_H
class splitmix
{
public:
    using result_type = uint32_t;
    static constexpr result_type (min)() { return 0; }
This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of a Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).
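As a point of reference for what gets optimized, here is a minimal sketch of a plain triple-loop multiplication in C. It is not the tuned kernel this post is about. The function name `matmul_naive`, the row-major layout, and the `float` element type are assumptions made for illustration only.

```c
#include <stddef.h>

/* Naive reference (illustrative, hypothetical name): C = A * B.
 * All matrices are assumed row-major and stored contiguously.
 * A is m x k, B is k x n, C is m x n. */
void matmul_naive(const float *A, const float *B, float *C,
                  size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            C[i * n + j] = acc;
        }
    }
}
```

A baseline like this is mainly useful for checking the results of a faster kernel on small inputs. It performs no blocking or vectorization of its own.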
Matrix multiplication is a mathematical operation that defines the product of
#!/usr/bin/env bash
# --slave /usr/bin/$1 $1 /usr/bin/$1-\${version} \\
function register_clang_version {
    local version=$1
    local priority=$2
    update-alternatives \
        --install /usr/bin/llvm-config llvm-config /usr/bin/llvm-config-${version} ${priority} \
from timeit import default_timer as time
import numpy as np
from numba import cuda
import os
os.environ['NUMBAPRO_LIBDEVICE']='/usr/lib/nvidia-cuda-toolkit/libdevice/'
os.environ['NUMBAPRO_NVVM']='/usr/lib/x86_64-linux-gnu/libnvvm.so.3.1.0'
import numpy
import torch
import ctypes