AWS AI Practitioner

  1. Introduction to Artificial Intelligence (AI)

    • Definition:
      • AI refers to the simulation of human intelligence in machines designed to think and act like humans.
      • Capabilities include understanding natural language, recognizing patterns, solving problems, and making decisions.
    • Examples of AI Applications:
      • Personal Assistants:
        • Virtual assistants like Alexa and Siri that perform tasks such as setting reminders, answering questions, and controlling smart home devices.
      • Fraud Detection:
        • Systems designed to identify and prevent fraudulent activities by analyzing transaction data in real-time to detect anomalies.
      • Medical Imaging:
        • AI applications that analyze medical images like X-rays, MRIs, and CT scans to assist in diagnosis and treatment planning.
      • Manufacturing:
        • Uses AI for quality control by identifying defects in products and for predictive maintenance to anticipate equipment failures.
      • Customer Support:
        • Automated chatbots that handle customer queries and provide product recommendations.
      • Predictive Analytics:
        • Utilizing historical data to forecast future trends and demands, aiding in strategic planning and decision-making.
    • Key Concepts:
      • Machine Learning (ML): A subset of AI that involves algorithms learning from data to make decisions without explicit programming.
      • Deep Learning: A further subset of ML that uses neural networks with many layers (deep networks) to analyze complex data patterns.
      • Generative AI: A branch of AI that focuses on creating new content, such as text, images, or code, by learning from existing data, often using models like neural networks.
  2. Machine Learning (ML)

    • Definition:
      • ML is a method of data analysis that automates the building of analytical models based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention.
    • Types of Data:
      • Structured Data:
        • Data that is organized in a defined manner (e.g., databases, spreadsheets).
        • Examples: Sales data, customer information.
      • Semi-Structured Data:
        • Partially organized data that doesn't fit into a relational database but has some organizational properties.
        • Examples: JSON files, XML documents.
      • Unstructured Data:
        • Data that does not have a predefined structure.
        • Examples: Text data (emails, social media posts), images, videos.
    • Training Process:
      • Involves feeding large amounts of data to an algorithm so that it can learn to make predictions or decisions.
      • Algorithms:
        • Mathematical models that process the data to find patterns or relationships.
      • Features:
        • Measurable properties or characteristics of the data used as input for algorithms.
      • Inference:
        • The process of using the trained model to make predictions on new, unseen data.
    • Machine Learning Styles:
      • Supervised Learning:
        • Trains on labeled data where the output is known.
        • Examples:
          • Image Classification: Identifying objects within images.
          • Spam Detection: Classifying emails as spam or not spam.
      • Unsupervised Learning:
        • Trains on unlabeled data to find hidden patterns.
        • Examples:
          • Clustering Analysis: Grouping data points based on similarity (e.g., customer segmentation).
          • Anomaly Detection: Identifying unusual data points (e.g., fraud detection).
      • Reinforcement Learning:
        • Trains an agent to make decisions through trial and error by receiving rewards or penalties.
        • Examples:
          • Game Playing: Training AI to play games like chess or Go.
          • Robotics: Teaching robots to navigate environments.
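
A minimal scikit-learn sketch of the supervised-learning style described above (the tiny "spam" dataset and labels are invented for illustration):

```python
# Supervised learning sketch: train on labeled examples, then run
# inference on new, unseen text. The dataset below is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["win a free prize now", "meeting at 10am tomorrow",
         "claim your free reward", "project status update"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (the known outputs)

# Features (word counts) are extracted from raw text, then a classifier
# learns the mapping from features to labels.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Inference: apply the trained model to data it has not seen before.
print(model.predict(["free prize waiting for you"]))  # likely [1] (spam)
```
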
  3. Deep Learning

    • Definition:
      • A subset of machine learning that utilizes neural networks with multiple layers (hence "deep") to model complex patterns in large datasets.
    • Neural Network Structure:
      • Input Layer:
        • Receives the initial data (e.g., pixels in an image, words in a sentence).
      • Hidden Layers:
        • Multiple layers where the data is processed. Each layer extracts features and passes them to the next layer.
        • Types of Layers:
          • Dense (Fully Connected) Layers: Every neuron is connected to every neuron in the next layer.
          • Convolutional Layers: Used primarily in image processing to detect spatial hierarchies.
          • Recurrent Layers: Used in sequence data to remember previous inputs (e.g., LSTMs for text).
      • Output Layer:
        • Produces the final prediction or classification (e.g., classifying an image as a dog or cat).
    • Applications:
      • Image Classification:
        • Identifying objects within images, such as recognizing different species of animals in photos.
      • Natural Language Processing (NLP):
        • Understanding and generating human language, such as translating languages or summarizing text.
    • Deep Learning vs. Traditional ML:
      • Data Type:
        • Traditional ML: Structured and labeled data.
        • Deep Learning: Unstructured data like images, text, and audio.
      • Feature Extraction:
        • Traditional ML: Requires manual feature selection and extraction.
        • Deep Learning: Automatically extracts features from raw data.
      • Computation Cost:
        • Traditional ML: Generally lower computational cost.
        • Deep Learning: Higher computational cost due to large datasets and complex models.
      • Use Cases:
        • Traditional ML: Predictive analytics, classification, recommendation.
        • Deep Learning: Image recognition, speech recognition, language translation.
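
A NumPy-only sketch of a forward pass through the layer structure described above; the weights are random, so this is purely illustrative rather than a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                  # input layer: e.g. 4 pixel intensities

W1, b1 = rng.random((8, 4)), rng.random(8)   # dense hidden layer (8 units)
W2, b2 = rng.random((2, 8)), rng.random(2)   # output layer (2 classes)

hidden = np.maximum(0, W1 @ x + b1)          # ReLU activation
logits = W2 @ hidden + b2                    # raw scores per class

# Softmax turns the scores into probabilities (e.g. dog vs. cat).
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs)
```
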
  4. Generative AI

    • Definition:
      • Refers to models that generate new content based on training data.
    • Techniques:
      • Transformers:
        • A type of model architecture that processes sequences of data (e.g., sentences) in parallel, making them efficient for training on large datasets.
        • Components of Transformers:
          • Self-Attention Mechanism: Weighs the importance of different parts of the input when generating output.
          • Encoder-Decoder Architecture: Consists of encoder layers to process input and decoder layers to generate output.
          • Positional Encoding: Encodes the relative position of each token in a sequence to preserve order.
    • Applications:
      • Content Creation:
        • Writing articles, generating images, composing music.
      • Language Models:
        • Understanding and generating human language, such as in chatbots and translation services.
    • Core Components:
      • Models:
        • Built using neural networks, trained to generate output resembling the input data.
      • Tokenization:
        • Converts text into token IDs, integer identifiers that correspond to words or sub-word units in the model's vocabulary.
      • Embeddings:
        • Numerical vector representations of tokens, capturing semantic meaning and context.
      • Self-Attention Mechanism:
        • Computes query, key, and value vectors for each token to determine attention weights.
      • Positional Encoding:
        • Encodes the relative position of each token to maintain the structure and order of sentences.
    • In-Context Learning:
      • Few-Shot Learning:
        • Provides a few examples within a prompt to guide the model in generating better outputs.
        • Example: Showing the model a few translated sentences to improve its translation capability.
      • Zero-Shot Learning:
        • The model performs a task it hasn't been explicitly trained for, without examples.
        • Example: Asking a model to generate a summary without providing any prior examples.
      • One-Shot Learning:
        • Provides only one example to learn from.
        • Example: Teaching the model to classify a rare object with a single labeled example.
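
A NumPy sketch of the scaled dot-product self-attention step described under "Core Components"; token embeddings and projection weights are random values used only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 3, 8
embeddings = rng.random((seq_len, d_model))   # one embedding per token

# Project each token embedding into query, key, and value vectors.
Wq, Wk, Wv = (rng.random((d_model, d_model)) for _ in range(3))
Q, K, V = embeddings @ Wq, embeddings @ Wk, embeddings @ Wv

# Attention weights: how strongly each token attends to every other token.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)

output = weights @ V   # context-aware representation of each token
print(weights.round(2))
```
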
  5. Guidelines for Responsible AI

    • Development of Responsible AI Systems:
      • Ensuring AI systems are ethical, transparent, and fair.
      • Principles:
        • Fairness: AI should be unbiased and treat all individuals equally.
        • Transparency: The workings of AI models should be understandable.
        • Robustness: AI systems should be resilient and handle unexpected situations gracefully.
        • Privacy and Security: Protecting user data and ensuring compliance with privacy regulations.
    • Transparent and Explainable Models:
      • Importance of creating AI models that are interpretable and explainable.
      • Techniques for Explainability:
        • LIME (Local Interpretable Model-agnostic Explanations): Provides local explanations for individual predictions.
        • SHAP (SHapley Additive exPlanations): Calculates the contribution of each feature to the model's prediction.
        • Integrated Gradients: Attributes the prediction of a model to its input features by computing gradients.
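
A short sketch of generating per-feature explanations with SHAP for a tree-based model; the TreeExplainer usage shown is a common pattern but may differ slightly between shap versions:

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:10])

# Each value is one feature's contribution to one prediction.
print(shap_values)
```
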
  6. Security, Compliance, and Governance for AI Solutions

    • Methods to Secure AI Systems:
      • Shared Responsibility Model:
        • AWS Responsibilities: Infrastructure security, service management.
        • Customer Responsibilities: Service configuration, application security.
      • Identity and Access Management (IAM):
        • IAM Users: Represent individuals needing access to AWS services.
        • IAM Groups: Collections of users with similar permissions.
        • IAM Roles: Temporary access permissions for AWS resources.
        • Principle of Least Privilege: Grant minimal permissions necessary.
      • Data Encryption:
        • Data at Rest: Encryption of stored data.
        • Data in Transit: Encryption during data transfer.
        • AWS Key Management Service (KMS): Management of encryption keys.
      • Logging and Monitoring:
        • AWS CloudTrail: Captures and logs API calls.
        • Amazon SageMaker Role Manager: Simplifies the creation of IAM roles for ML tasks.
    • Governance and Compliance Regulations for AI Systems:
      • AWS Compliance Tools:
        • AWS Audit Manager: Automates compliance audits and evidence collection.
        • AWS Config: Monitors resource configurations and evaluates compliance.
        • Amazon Inspector: Provides automated security assessments.
        • AWS Trusted Advisor: Offers guidance on security best practices.
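
A minimal boto3 sketch of encrypting a small payload with AWS KMS; the key alias is a placeholder, and AWS credentials and region are assumed to be configured in the environment:

```python
import boto3

kms = boto3.client("kms")
key_id = "alias/my-app-key"  # placeholder KMS key alias

# Encrypt data before storing it (data at rest).
encrypted = kms.encrypt(KeyId=key_id, Plaintext=b"sensitive record")
ciphertext = encrypted["CiphertextBlob"]

# Decrypt it again when needed.
decrypted = kms.decrypt(CiphertextBlob=ciphertext)
print(decrypted["Plaintext"])  # b"sensitive record"
```
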
  7. Types of Machine Learning Problems

    • Supervised Learning:
      • Definition: The model is trained on a labeled dataset, where each training example is paired with an output label.
      • Types of Supervised Learning:
        • Classification:
          • Binary Classification: Categorizes data into two classes (e.g., spam vs. not spam emails).
          • Multiclass Classification: Categorizes data into more than two classes (e.g., categorizing news articles into sports, finance).
        • Regression:
          • Linear Regression: Predicts a continuous output with a linear relationship between input and output.
          • Multiple Linear Regression: Uses multiple input variables to predict the output.
          • Logistic Regression: Used for binary classification tasks, predicting the probability of an event occurring.
    • Unsupervised Learning:
      • Definition: The model is given data without explicit instructions on what to do with it, identifying underlying patterns or structures.
      • Clustering:
        • K-Means Clustering: Divides data into a predefined number of clusters based on similarity.
        • Hierarchical Clustering: Builds a tree of clusters based on data similarity.
      • Anomaly Detection: Identifies rare items or events that do not conform to expected patterns (e.g., fraud detection).
    • Semi-Supervised Learning:
      • Definition: A blend of supervised and unsupervised learning, where the model is trained on a small amount of labeled data and a larger amount of unlabeled data.
    • Reinforcement Learning:
      • Definition: An agent learns to make decisions by performing actions in an environment to maximize cumulative rewards.
      • Examples: Game playing, robotics.
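
A minimal scikit-learn sketch of unsupervised clustering (K-Means) on invented customer data, grouping unlabeled records by similarity:

```python
import numpy as np
from sklearn.cluster import KMeans

# Each row: [annual spend, visits per month] for one customer (invented data).
customers = np.array([[200, 2], [220, 3], [1500, 20],
                      [1600, 22], [800, 10], [750, 9]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # cluster assignment per customer
print(kmeans.cluster_centers_)  # centroid of each segment
```
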
  8. Model Deployment

    • Batch vs. Real-Time Inference:
      • Batch Inference:
        • Ideal for large numbers of inferences where results can be delayed (e.g., overnight processing).
        • Cost-effective as resources are used intermittently.
      • Real-Time Inference:
        • Suitable for immediate responses to client requests, often via a REST API.
        • Deployed models respond immediately, ideal for applications like chatbots.
    • Deployment Options:
      • AWS API Gateway & Lambda:
        • API Gateway: Handles client interactions and passes requests to Lambda running the model.
      • Docker Containers:
        • Used for deploying models, offering versatility across AWS services (ECS, EKS, Lambda, EC2).
      • Amazon SageMaker:
        • Provides managed endpoints for various inference types (batch, asynchronous, serverless, real-time).
        • Simplifies deployment by managing infrastructure, scalability, and updates.
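
A minimal boto3 sketch of calling a deployed SageMaker real-time endpoint; the endpoint name and payload format are placeholders and depend on the deployed model:

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="my-realtime-endpoint",        # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"features": [1.2, 3.4, 5.6]}),
)
prediction = json.loads(response["Body"].read())
print(prediction)
```
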
  9. Model Monitoring

    • Performance Degradation:
      • Over time, model performance may degrade due to factors like data quality, model quality, and bias.
      • Mitigation Strategies:
        • Retraining models with new data, adjusting algorithms, or updating features.
    • Monitoring Systems:
      • Data & Concept Drift:
        • Detects significant changes in data distribution (data drift) and changes in target variable properties (concept drift).
      • Amazon SageMaker Model Monitor:
        • Monitors models in production, detects errors, and compares data against a baseline.
        • Sends alerts via CloudWatch, potentially triggering re-training cycles.
    • Automation & MLOps:
      • MLOps:
        • Incorporates DevOps practices into ML model development, focusing on automating tasks, ensuring version control, and monitoring deployments.
        • Improves productivity, repeatability, reliability, compliance, and data quality.
      • Amazon SageMaker Pipelines:
        • Facilitates the orchestration of ML pipelines, enabling the deployment of models and tracking lineage.
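
A simple statistical sketch of the data-drift idea: compare a feature's baseline (training) distribution against recent production data. This only illustrates the concept; SageMaker Model Monitor performs its own baseline comparison and alerting:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(loc=50, scale=5, size=1_000)     # training-time feature
production = rng.normal(loc=58, scale=5, size=1_000)   # shifted production feature

stat, p_value = ks_2samp(baseline, production)
if p_value < 0.01:
    print(f"Possible data drift detected (KS statistic = {stat:.2f})")
```
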
  10. Model Evaluation Metrics

    • Classification Metrics:
      • Confusion Matrix:
        • True Positive (TP): Correctly predicted positive cases.
        • True Negative (TN): Correctly predicted negative cases.
        • False Positive (FP): Incorrectly predicted positive cases.
        • False Negative (FN): Incorrectly predicted negative cases.
      • Accuracy: Measures the percentage of correct predictions. Suitable for balanced datasets.
      • Precision: Focuses on the accuracy of positive predictions. Important when minimizing false positives.
      • Recall: Measures the ability to detect all actual positives. Used when minimizing false negatives is critical.
      • F1 Score: Balances precision and recall. Ideal when both metrics are important.
      • AUC-ROC: Evaluates binary classification models by plotting true positive rate against false positive rate across thresholds.
    • Regression Metrics:
      • Mean Squared Error (MSE): Average of squared differences between predictions and actual values. Sensitive to outliers.
      • Root Mean Squared Error (RMSE): Square root of MSE, easier to interpret as it's in the same units as the dependent variable.
      • Mean Absolute Error (MAE): Average of absolute errors, less sensitive to outliers than MSE.
    • Business Metrics:
      • Return on Investment (ROI): Measures the profitability of an investment.
      • Cost Reduction: Quantifies the savings achieved through AI solutions.
      • Increased Sales: Evaluates the impact of AI solutions on revenue growth.
      • AWS Cost Explorer with Cost Allocation Tags: Monitors project expenses.
    • Generative AI Metrics:
      • ROUGE (Recall-Oriented Understudy for Gisting Evaluation): Measures the quality of summarization and translation by comparing generated text to reference text.
      • BLEU (Bilingual Evaluation Understudy): Evaluates machine translation by comparing the model's translations to human translations.
      • GLUE (General Language Understanding Evaluation): A benchmark that tests various language understanding tasks like sentiment analysis.
      • SuperGLUE: Extends GLUE by adding tasks that require complex reasoning and understanding, like reading comprehension.
      • MMLU (Massive Multitask Language Understanding): Tests a model's knowledge and problem-solving skills across diverse topics, from history to mathematics.
      • BIG-bench: Challenges models with tasks beyond current capabilities, such as advanced reasoning and specialized knowledge.
      • HELM (Holistic Evaluation of Language Models): Focuses on improving model transparency and evaluates performance on tasks like summarization and sentiment analysis.
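
The classification metrics above, computed from a small set of invented predictions with scikit-learn:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, tn, fp, fn)                   # TP, TN, FP, FN counts
print(accuracy_score(y_true, y_pred))   # (TP + TN) / total
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```
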
  11. AWS AI Services

    • Computer Vision Services:
      • Amazon Rekognition:
        • Deep learning service for computer vision tasks.
        • Use Cases: Face recognition, object detection, content moderation, real-time video analysis.
      • Amazon Textract:
        • Extracts text, handwriting, forms, and tables from scanned documents.
        • Use Cases: Automating document processing (e.g., invoices, forms).
    • Natural Language Processing (NLP) Services:
      • Amazon Comprehend:
        • NLP service that discovers insights and relationships in text.
        • Use Cases: Sentiment analysis, PII detection, entity recognition.
      • Amazon Lex:
        • Builds conversational voice and text interfaces using the same technology that powers Amazon Alexa.
        • Use Cases: Chatbots, interactive voice response systems for customer service.
      • Amazon Polly:
        • Converts text into natural-sounding speech in multiple languages.
        • Use Cases: Text-to-speech conversion for audio content, enhancing accessibility and engagement.
      • Amazon Kendra:
        • ML-powered search service for enterprise systems.
        • Use Cases: Intelligent search with natural language queries.
      • Amazon Transcribe:
        • Converts spoken language into text (speech-to-text).
        • Use Cases: Real-time transcription, captioning for live or recorded audio/video.
    • Personalization & Recommendation Services:
      • Amazon Personalize:
        • Provides personalized recommendations for customers.
        • Use Cases: Product/content recommendations, targeted marketing campaigns.
    • Translation Services:
      • Amazon Translate:
        • Neural machine translation for text across 75 languages.
        • Use Cases: Real-time translation in chat applications, multilingual content creation.
    • Forecasting & Planning Services:
      • Amazon Forecast:
        • AI service for time series forecasting.
        • Use Cases: Demand forecasting, inventory management, financial planning.
    • Fraud Detection Services:
      • Amazon Fraud Detector:
        • Detects potentially fraudulent online activities using pre-trained models.
        • Use Cases: Preventing online payment fraud, detecting fake accounts, account takeover prevention.
    • Generative AI Services:
      • Amazon Bedrock:
        • Service to build generative AI applications using foundation models from top AI providers.
        • Use Cases: Content creation, image generation, retrieval augmented generation (RAG) for enhanced model accuracy.
    • Custom ML Development:
      • Amazon SageMaker:
        • Comprehensive service for building, training, and deploying custom ML models.
        • Use Cases: Custom model development for predictive analytics, large-scale data processing, real-time inference.
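
A minimal boto3 sketch of one of the services above (Amazon Comprehend sentiment analysis); AWS credentials and region are assumed to be configured:

```python
import boto3

comprehend = boto3.client("comprehend")
result = comprehend.detect_sentiment(
    Text="The delivery was fast and the product works great.",
    LanguageCode="en",
)
print(result["Sentiment"])       # e.g. "POSITIVE"
print(result["SentimentScore"])  # confidence score per sentiment class
```
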
  12. Amazon SageMaker Services

    • SageMaker Ground Truth:
      • A data labeling service for quickly building highly accurate training datasets for machine learning.
      • Features: Human-in-the-loop labeling, integration with other AWS services, automated data labeling using machine learning models.
    • SageMaker Canvas:
      • Enables business analysts to build machine learning models and generate accurate predictions without writing code.
      • Features: No-code interface, automated model generation, supports structured data.
    • SageMaker Experiments:
      • A tool to organize, track, compare, and evaluate machine learning experiments.
      • Features: Experiment tracking, lineage tracking, comparison of experiment results, integration with SageMaker Studio.
    • SageMaker Model Monitor:
      • Monitors deployed models in production for data and model quality issues and automatically detects and alerts on potential problems.
      • Features: Real-time monitoring, alerting via CloudWatch, integration with SageMaker Studio for visualization, supports custom rules and built-in monitors.
    • SageMaker Pipelines:
      • A service to build, automate, and manage end-to-end machine learning workflows.
      • Features: Workflow orchestration, model deployment, lineage tracking, integration with SageMaker Studio, Python SDK, JSON-based pipeline definition, supports conditional logic.
    • SageMaker Model Registry:
      • A centralized repository to store, version, and manage machine learning models.
      • Features: Model versioning, model lineage tracking, integration with deployment pipelines, support for multiple model versions.
    • SageMaker Feature Store:
      • A purpose-built repository for storing, retrieving, and sharing machine learning features.
      • Features: Feature definition storage, real-time and offline retrieval, integration with SageMaker Pipelines and SageMaker Studio, versioning of feature definitions.
    • SageMaker Inference Recommender:
      • Helps select the best compute instance and configuration for inference workloads by running benchmark tests on different configurations.
      • Features: Instance type recommendation, configuration testing, support for different inference options, integration with SageMaker deployment.
    • SageMaker Serverless Inference:
      • Allows serving machine learning models without managing infrastructure, automatically scaling based on traffic patterns.
      • Features: No need for provisioning instances, automatic scaling, cost-effective for intermittent workloads, leverages AWS Lambda.
    • SageMaker Real-Time Inference:
      • Provides persistent endpoints for real-time inference that are fully managed and can automatically scale.
      • Features: Low-latency real-time responses, persistent endpoints, support for auto-scaling, integration with other AWS services like API Gateway.
    • SageMaker Batch Transform:
      • A service for offline inference that processes large datasets in batches.
      • Features: Suitable for large datasets, supports gigabyte-scale data, no need for persistent endpoints, integration with S3 for input/output data.
    • SageMaker Asynchronous Inference:
      • Supports workloads that involve large payloads or have long inference processing times, decoupling request and response so clients don't have to wait for the inference response.
      • Features: Asynchronous response handling, decoupling request and response, support for large payloads, storage of results in S3, cost-effective for long-running or large-payload inferences.
  13. Foundational Models

    • Selection Criteria for Pre-trained Models:
      • Cost: Consider the expense of training the model, including hardware, storage, and computational resources.
      • Latency Constraints: For real-time applications, the model must provide rapid responses.
      • Modalities Supported: Models may handle different types of data (text, image, etc.) and may require ensemble methods to improve performance.
      • Architecture and Complexity: More complex models may offer higher accuracy but require more computational resources.
      • Performance Metrics: Evaluate models using metrics like accuracy, precision, recall, F1 score, RMSE, MAP, MAE.
    • Biases in Training Data:
      • Bias Mitigation: Address biases present in training data to ensure ethical and fair outcomes.
      • Ethical Considerations: Make informed decisions about model selection and fine-tuning with a focus on minimizing biases.
    • Availability and Compatibility:
      • Model Repositories: Check if the model is available on platforms like TensorFlow Hub, PyTorch Hub, Hugging Face.
      • Compatibility: Ensure the model aligns with your framework, language, and environment.
    • Customization and Explainability:
      • Customization Techniques:
        • Model Fine-Tuning: Adjusting a pre-trained model on new data to improve task-specific performance.
        • Transfer Learning: Adapting a pre-trained model to a new but related task.
        • Meta Learning: Models learn to adapt to new tasks quickly.
        • Self-Supervised Learning: Models learn to predict parts of their input data, creating labeled data from raw data.
      • Explainability Tools:
        • LIME, SHAP, Integrated Gradients: Techniques for interpreting model predictions.
    • Inference Parameters:
      • Temperature: Controls the randomness of responses. Higher values increase diversity, lower values make the output more focused and deterministic.
      • Top K: Limits the number of top predictions considered during generation, reducing randomness.
      • Top P (Nucleus Sampling): Uses cumulative probability to determine the response space, dynamically choosing the set of likely next words.
      • Response Length: Sets limits on the length of model outputs to prevent overly long or short responses.
      • Penalties: Discourage repetition; a repetition (frequency) penalty lowers the likelihood of repeating tokens that already appear often, while a presence penalty discourages reusing any token that has already appeared, encouraging new topics.
    • Evaluation Metrics for Generative AI:
      • ROUGE: Evaluates the quality of text summarization.
      • BLEU: Measures the accuracy of machine translation.
      • GLUE: Benchmarks for general language understanding.
      • SuperGLUE: Extends GLUE with more challenging language understanding tasks.
      • MMLU: Tests broad knowledge and problem-solving skills.
      • BIG-bench: Evaluates models on tasks that are beyond current capabilities.
      • HELM: Focuses on transparency and bias detection in AI outputs.
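
A boto3 sketch of passing inference parameters when invoking a foundation model on Amazon Bedrock; the model ID is only an example, and the request-body fields vary by model provider (the Titan-style fields below are one possible format):

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")
body = {
    "inputText": "Summarize the benefits of predictive maintenance.",
    "textGenerationConfig": {
        "temperature": 0.5,    # lower = more focused, deterministic output
        "topP": 0.9,           # nucleus sampling over the most likely tokens
        "maxTokenCount": 256,  # response length limit
    },
}
response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",  # example model ID
    body=json.dumps(body),
)
print(json.loads(response["body"].read()))
```
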
  14. Prompt Engineering Techniques

    • Introduction to Prompts:
      • Definition and components of a prompt.
    • Prompting Techniques:
      • Few-Shot Prompting: Providing a few examples to guide the model.
        • Example: Translate the following sentences into French.
      • Zero-Shot Prompting: Asking the model to perform a task without examples.
        • Example: Translate "good morning" to Spanish.
      • One-Shot Prompting: Providing a single example.
        • Example: Show how to solve a single math problem to guide the model.
      • Chain-of-Thought Prompting: Breaking down complex tasks into intermediate steps to improve coherence.
        • Example: Provide step-by-step reasoning for a scientific explanation.
      • Prompt Tuning: Using continuous embeddings optimized during training to improve model outputs.
    • Best Practices:
      • Be Specific: Define clear instructions and examples.
      • Include Examples: Guide the model with sample inputs and outputs.
      • Experiment and Iterate: Test and refine prompts to enhance model performance.
      • Use Comments: Provide supporting context through comments without cluttering the main prompt.
      • Add Guardrails: Implement safety measures to manage AI interactions.
    • Risks and Limitations:
      • Prompt Injection: Manipulating prompts to produce unintended outputs.
      • Jailbreaking: Bypassing safety mechanisms set by prompt engineers.
      • Hijacking: Overriding the original prompt with new instructions.
      • Poisoning: Embedding harmful instructions in various inputs.
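
The prompting techniques above, sketched as plain Python strings (illustrative only; how the prompt is sent depends on the model or service used):

```python
# Zero-shot: ask for the task with no examples.
zero_shot = "Translate 'good morning' to Spanish."

# Few-shot: include a handful of worked examples before the real input.
few_shot = (
    "Translate the following sentences into French.\n"
    "English: Thank you.            French: Merci.\n"
    "English: See you tomorrow.     French: À demain.\n"
    "English: Where is the station? French:"
)

# Chain-of-thought: ask the model to reason through intermediate steps.
chain_of_thought = (
    "A train travels 120 km in 2 hours. What is its average speed?\n"
    "Think step by step before giving the final answer."
)

print(zero_shot, few_shot, chain_of_thought, sep="\n\n")
```
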
  15. Vector Databases and Retrieval Augmented Generation (RAG)

    • Vector Databases:
      • Function: Store data as numerical vectors for efficient lookups and enhance model capabilities by providing relevant data.
      • AWS Services for Vector Search:
        • Amazon OpenSearch Service, Amazon Aurora, Amazon MemoryDB for Redis, Amazon Neptune, Amazon DocumentDB, Amazon RDS for PostgreSQL.
    • Retrieval Augmented Generation (RAG):
      • Components:
        • Retriever: Searches knowledge base for relevant data.
        • Generator: Produces outputs based on the retrieved data.
      • Applications:
        • Question Answering: Enhances model responses by integrating external knowledge.
        • Content Generation: Uses external data to improve content accuracy.
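
A minimal retriever-plus-generator sketch of the RAG flow above; the embed() helper is a toy stand-in (real systems use an embedding model and a vector database such as those listed above):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy embedding: normalized character-frequency vector (illustrative only).
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

documents = [
    "Our return policy allows refunds within 30 days.",
    "Shipping takes 3-5 business days within the EU.",
]
doc_vectors = np.array([embed(d) for d in documents])

# Retriever: find the document most similar to the query.
query = "How long do I have to return an item?"
scores = doc_vectors @ embed(query)
best = documents[int(np.argmax(scores))]

# Generator input: the retrieved context is prepended to the question.
prompt = f"Answer using this context:\n{best}\n\nQuestion: {query}"
print(prompt)
```
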
  16. Overview of Responsible AI

    • Core Dimensions:
      • Fairness: Ensures equitable treatment across diverse groups.
      • Explainability: Provides clear reasons for AI decisions.
      • Robustness: Ensures tolerance to failures and minimizes errors.
      • Privacy: Protects user data and ensures PII is not exposed.
      • Governance: Meets compliance and risk management standards.
      • Transparency: Clearly communicates model capabilities and risks.
  17. Methods to Secure AI Systems

    • Shared Responsibility Model:
      • AWS Responsibilities: Security of the cloud infrastructure.
      • Customer Responsibilities: Security within the cloud.
    • IAM (Identity and Access Management):
      • Purpose: Manages access to AWS resources, including user creation, permissions, and MFA.
      • Root User: Initial account with unrestricted access; best practices include minimizing usage and enabling MFA.
      • IAM Users and Groups: Best practices for managing user access.
      • IAM Roles: Reducing risk by providing temporary access.
    • Data Encryption:
      • Types: Data at Rest and Data in Transit.
      • AWS KMS (Key Management Service): Manage and control encryption keys.
    • S3 Block Public Access: Prevents public access to S3 buckets and objects.
    • SageMaker Role Manager: Simplifies role creation for SageMaker tasks.
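
A minimal boto3 sketch of enabling S3 Block Public Access on a bucket; the bucket name is a placeholder and credentials are assumed to be configured:

```python
import boto3

s3 = boto3.client("s3")
s3.put_public_access_block(
    Bucket="my-training-data-bucket",  # placeholder bucket name
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```
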
  18. Compliance Tools and AWS Services

    • AWS Audit Manager: Maps compliance requirements to AWS usage data and produces assessment reports.
    • AWS Config: Monitors resource configurations and compliance.
    • Amazon Inspector: Assesses security vulnerabilities in applications and containers.
    • AWS Trusted Advisor: Provides recommendations for cost optimization, performance, security, and operational excellence.
    • AWS Glue DataBrew: Visual data preparation and quality management tools.
    • AWS Glue Data Quality: Sets data quality rules and detects anomalies.