This guide will help you set up and run the Nanonets OCR model locally for PDF processing.
GitHub Gist: https://gist.github.com/kordless/652234bf0b32b02e39cef32c71e03400
# 0. Create and activate conda environment first
import sys
import os
import logging
import requests
import re
import json
from typing import Dict, Any, Optional, List, Tuple
from mcp.server.fastmcp import FastMCP, Context
from urllib.parse import urlparse, urljoin
import asyncio
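The imports above belong to an MCP server built on FastMCP that fetches content over HTTP. The rest of the server is not shown in this excerpt; the sketch below is a minimal illustration of how these imports are typically wired together, with the server name and the fetch_url tool as assumptions rather than the gist's actual code.

```python
# Minimal sketch (assumed server name and tool; not the gist's implementation)
from mcp.server.fastmcp import FastMCP
import requests

mcp = FastMCP("web-fetch")  # hypothetical server name

@mcp.tool()
def fetch_url(url: str) -> str:
    """Fetch a URL and return its raw text content."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```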
""" | |
Adaptive Connector Framework (ACF) | |
A self-bootstrapping alternative to MCP that dynamically builds and tests | |
connectors based on current needs. The system evolves its own capabilities | |
through iterative learning and testing. | |
Key components: | |
1. Registry - Manages available connectors and their capabilities | |
2. Connector Builder - Dynamically creates new connectors |
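The docstring lists a Registry as the first component, but its implementation is not included in this excerpt. Here is a minimal sketch of what a connector registry along those lines might look like; the class name and methods are hypothetical.

```python
# Hypothetical sketch of the Registry component described above
from typing import Callable, Dict, List

class ConnectorRegistry:
    """Tracks available connectors and the capabilities they expose."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Callable] = {}
        self._capabilities: Dict[str, List[str]] = {}

    def register(self, name: str, connector: Callable, capabilities: List[str]) -> None:
        """Add a connector under a name with the capabilities it advertises."""
        self._connectors[name] = connector
        self._capabilities[name] = capabilities

    def find(self, capability: str) -> List[str]:
        """Return the names of connectors that advertise a capability."""
        return [n for n, caps in self._capabilities.items() if capability in caps]
```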
import os
import sys
from dotenv import load_dotenv, set_key
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.runnables import RunnableSequence
from langchain.tools import Tool
from langchain.agents import create_react_agent, AgentExecutor
from langchain.schema import HumanMessage
import getpass
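These imports suggest a ReAct-style agent built with LangChain and ChatOpenAI. The script body is not shown here, so the sketch below only illustrates how the imported pieces usually fit together; the placeholder tool, prompt, and model name are assumptions.

```python
# Sketch only: placeholder tool and prompt; the gist's actual wiring may differ
from langchain_openai import ChatOpenAI
from langchain.tools import Tool
from langchain.agents import create_react_agent, AgentExecutor
from langchain.prompts import PromptTemplate

def echo(text: str) -> str:
    """Trivial placeholder tool so the agent has something to call."""
    return text

tools = [Tool(name="echo", func=echo, description="Echo the input back.")]

# A real ReAct prompt also needs Thought/Action/Observation format instructions
# (e.g. the community "hwchase17/react" prompt); this one is abbreviated.
prompt = PromptTemplate.from_template(
    "Answer the question using the tools available.\n"
    "Tools: {tools}\nTool names: {tool_names}\n"
    "Question: {input}\n{agent_scratchpad}"
)

llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model name
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# executor.invoke({"input": "Say hello"})
```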
To run the Python script that splits a PDF into segments of just under 25 MB each, follow these steps (a sketch of the splitting logic appears after the list):
1. Python installation: Ensure Python is installed on your system. If not, download and install it from python.org.
2. PyPDF2 library: The script uses the PyPDF2 library, which you can install with pip, Python's package installer. pip comes bundled with Python 3.4 and later, so it should already be available.
3. Open a terminal or command prompt: On Windows, open Command Prompt by searching for cmd in the Start menu.
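The splitting script itself is not reproduced here. The following is a minimal sketch, assuming the approach is to accumulate pages and start a new output file whenever the serialized part would exceed the limit; the file names and helper functions are illustrative.

```python
# Hypothetical sketch: split input.pdf into parts, each kept just under 25 MB
import io
from PyPDF2 import PdfReader, PdfWriter

MAX_BYTES = 25 * 1024 * 1024  # size ceiling for each segment

def _size_of(pages) -> int:
    """Serialize the pages to an in-memory buffer and return the byte count."""
    writer = PdfWriter()
    for page in pages:
        writer.add_page(page)
    buf = io.BytesIO()
    writer.write(buf)
    return buf.getbuffer().nbytes

def _flush(pages, part: int) -> None:
    """Write the accumulated pages out as part_<n>.pdf."""
    writer = PdfWriter()
    for page in pages:
        writer.add_page(page)
    with open(f"part_{part}.pdf", "wb") as f:
        writer.write(f)

def split_pdf(path: str, max_bytes: int = MAX_BYTES) -> None:
    reader = PdfReader(path)
    current, part = [], 1
    for page in reader.pages:
        # If adding this page would push the part over the limit, flush first
        if current and _size_of(current + [page]) > max_bytes:
            _flush(current, part)
            current, part = [], part + 1
        current.append(page)
    if current:
        _flush(current, part)

if __name__ == "__main__":
    split_pdf("input.pdf")  # assumed input file name
```

Re-serializing the part to measure its size on every page is simple but slow for large files; it is only meant to show the idea.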
# tokens from https://cloud.featurebase.com/configuration/api-keys
featurebase_token = "<token>"

# featurebase ($300 free credit on signup)
# https://query.featurebase.com/v2/databases/bc355-t-t-t-362c1416/query/sql (but remove /query/sql)
featurebase_endpoint = "https://query.featurebase.com/v2/databases/<uuid-only-no-query-sql>"
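The snippet does not show how the query is actually issued. The sketch below assumes the code re-appends /query/sql to the endpoint and POSTs a SQL string with a Bearer token; treat the request shape as an assumption to verify against FeatureBase's documentation.

```python
# Assumed request shape -- verify against FeatureBase's SQL endpoint docs
import requests

def run_sql(sql: str) -> dict:
    url = f"{featurebase_endpoint}/query/sql"  # re-append the path stripped above
    headers = {
        "Authorization": f"Bearer {featurebase_token}",
        "Content-Type": "text/plain",
    }
    response = requests.post(url, headers=headers, data=sql, timeout=30)
    response.raise_for_status()
    return response.json()

# Example: run_sql("SHOW TABLES;")
```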
This example illustrates a way to call a function dynamically while querying an OpenAI GPT model. It uses the newly released function-calling support in OpenAI's completion endpoints.
The general concept is to use a decorator to extract information from a function so it can be presented to the language model, and then to pass the result of that function back to the completion endpoint for language augmentation.
A wide variety of functions can be swapped in for use by the model. By changing the get_top_stories function, plus the prompt in run_conversation, you should be able to get the model to run your own function without changing any of the other code.
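The decorator itself is not included in this excerpt. A minimal sketch of the idea, building an OpenAI function-calling schema from a function's signature and docstring, might look like this; the attribute name and the type mapping are illustrative assumptions.

```python
# Illustrative sketch: derive an OpenAI function-calling schema from a Python function
import inspect

TYPE_MAP = {int: "integer", float: "number", str: "string", bool: "boolean"}

def openai_function(func):
    """Attach a function-calling schema (built from the signature) to the function."""
    params = {}
    for name, p in inspect.signature(func).parameters.items():
        json_type = TYPE_MAP.get(p.annotation, "string")
        params[name] = {"type": json_type}
    func.openai_schema = {  # hypothetical attribute name
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": params, "required": list(params)},
    }
    return func

@openai_function
def get_top_stories(limit: int) -> list:
    """Return the top Hacker News stories."""
    ...
```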
To use this, create a config.py file and add a variable with your OpenAI token:
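The exact variable name the script imports is not shown here; a typical config.py might look like the following, with openai_token as an assumed name.

```python
# config.py -- "openai_token" is an assumed variable name; match whatever the script imports
openai_token = "sk-..."
```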
""" | |
Hacker News Top Stories | |
Author: | |
Date: June 12, 2023 | |
Description: | |
This script fetches the top 10 stories from Hacker News using Algolia's search API. It retrieves the stories posted within the last 24 hours and prints their titles and URLs. | |
Dependencies: | |
- requests: HTTP library for sending API requests |
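The script body is not included in this excerpt. A minimal sketch of the described behavior, using Algolia's public Hacker News search endpoint, could look like this; the exact query parameters the original uses are an assumption.

```python
# Sketch: top 10 HN stories from the last 24 hours via Algolia's search API
import time
import requests

def top_stories(limit: int = 10) -> None:
    cutoff = int(time.time()) - 24 * 60 * 60  # 24 hours ago, in unix seconds
    response = requests.get(
        "https://hn.algolia.com/api/v1/search",
        params={
            "tags": "story",
            "numericFilters": f"created_at_i>{cutoff}",
            "hitsPerPage": limit,
        },
        timeout=30,
    )
    response.raise_for_status()
    for hit in response.json().get("hits", []):
        print(hit.get("title"), "-", hit.get("url"))

if __name__ == "__main__":
    top_stories()
```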
import openai
import numpy as np
from openai.embeddings_utils import get_embedding

openai.api_key = "TOKEN"

def gpt3_embedding(content, engine='text-similarity-ada-001'):
    # drop non-ASCII characters before sending the text to the embeddings endpoint
    content = content.encode(encoding='ASCII', errors='ignore').decode()
    response = openai.Embedding.create(input=content, engine=engine)
    vector = response['data'][0]['embedding']  # this is a normal list
    return vector
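Since numpy is already imported, a natural usage example is comparing two embeddings with cosine similarity; the helper below is added for illustration and is not part of the original gist.

```python
# Illustration: cosine similarity between two embedding vectors
def similarity(v1, v2):
    v1, v2 = np.array(v1), np.array(v2)
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# a = gpt3_embedding("the cat sat on the mat")
# b = gpt3_embedding("a feline rested on the rug")
# print(similarity(a, b))
```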
State of AI Report | |
October 11, 2022 | |
#stateofai | |
stateof.ai | |
Ian Hogarth | |
Nathan Benaich | |
About the authors | |
Nathan is the General Partner of Air Street Capital, a venture capital firm investing in AI-first technology and life science companies. He founded RAAIS and London.AI (AI community for industry and research), the RAAIS Foundation (funding open-source AI projects), and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research.