{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "private_outputs": true,
      "provenance": [],
      "authorship_tag": "ABX9TyOnh4txOUwml1PTmGksSne7",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/gist/CliffordAnderson/952fe8e654f7437f5849c3241249290e/using-pretrained-abstract-to-tweet-model.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "notebook-title"
      },
      "source": [
        "# Abstract-to-Tweet Generator\n",
        "\n",
        "This notebook demonstrates how to use a pre-trained transformer model to convert academic abstracts into tweet-sized summaries. The model has been fine-tuned specifically for this task, making it ideal for researchers who want to share their work on social media.\n",
        "\n",
        "## Overview\n",
        "- **Model**: `andersoncliffb/abstracts_to_tweet_model`\n",
        "- **Task**: Text-to-text generation (sequence-to-sequence)\n",
        "- **Purpose**: Convert long academic abstracts into concise, Twitter-friendly summaries\n",
        "- **Framework**: Hugging Face Transformers\n",
        "\n",
        "## What You'll Learn\n",
        "1. How to load a pre-trained sequence-to-sequence model\n",
        "2. How to set up a text generation pipeline\n",
        "3. How to generate tweet-length summaries from academic text\n",
        "4. Best practices for using the model with different types of abstracts"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "setup-section"
      },
      "source": [
        "## 1. Setup and Dependencies\n",
        "\n",
        "First, we'll import the necessary components from the Hugging Face Transformers library: the model and tokenizer loaders, plus the `pipeline` helper used for text generation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "b_FPaSQce9V-"
      },
      "outputs": [],
      "source": [
        "# Import required components from Hugging Face Transformers\n",
        "from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline\n",
        "\n",
        "print(\"✅ Dependencies imported successfully!\")\n",
        "print(\"Ready to load the abstract-to-tweet model...\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "model-loading-section"
      },
      "source": [
        "## 2. Model and Tokenizer Loading\n",
        "\n",
        "Here we load the pre-trained model and tokenizer. This model has been fine-tuned specifically to convert academic abstracts into tweet-length summaries.\n",
        "\n",
        "**Key Components:**\n",
        "- **Tokenizer**: Converts text into tokens that the model can understand\n",
        "- **Model**: The neural network that performs the text-to-text generation\n",
        "- **Pipeline**: A high-level interface that combines the tokenizer and model for easy use"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Load the tokenizer for the abstract-to-tweet model\n",
        "# The tokenizer handles text preprocessing and postprocessing\n",
        "print(\"Loading tokenizer...\")\n",
        "tokenizer = AutoTokenizer.from_pretrained(\n",
        "    'andersoncliffb/abstracts_to_tweet_model'\n",
        ")\n",
        "\n",
        "# Load the pre-trained sequence-to-sequence model\n",
        "# This model has been fine-tuned specifically for abstract-to-tweet conversion\n",
        "print(\"Loading model...\")\n",
        "model = AutoModelForSeq2SeqLM.from_pretrained(\n",
        "    'andersoncliffb/abstracts_to_tweet_model'\n",
        ")\n",
        "\n",
        "# Create a text generation pipeline\n",
        "# This combines the model and tokenizer into an easy-to-use interface\n",
        "print(\"Creating pipeline...\")\n",
        "pipe = pipeline(\n",
        "    'text2text-generation',\n",
        "    model=model,\n",
        "    tokenizer=tokenizer,\n",
        "    pad_token_id=tokenizer.pad_token_id  # Ensures proper padding for batch processing\n",
        ")\n",
        "\n",
        "print(\"✅ Model, tokenizer, and pipeline loaded successfully!\")\n",
        "print(f\"Model name: {model.config.name_or_path}\")\n",
        "print(f\"Tokenizer vocab size: {len(tokenizer)}\")"
      ],
      "metadata": {
        "id": "model-loading-cell"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "input-preparation-section"
      },
      "source": [
        "## 3. Input Preparation\n",
        "\n",
        "Now we'll prepare our input text. In this example, we're using a sample academic abstract about cross-lingual machine reading comprehension.\n",
        "\n",
        "**Note**: The model works best with:\n",
        "- Academic abstracts (100-300 words)\n",
        "- Clear, structured text\n",
        "- Complete sentences and proper grammar"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Define the input abstract to be converted into a tweet\n",
        "# This is a sample academic abstract about machine learning and NLP\n",
        "sample_abstract = \"\"\"\n",
        "In this paper, we introduce a novel learning framework for addressing inconsistencies and incompleteness in \n",
        "Effects of Pre-training Task Structure on Cross-lingual Transfer of large-scale, multilingual machine reading \n",
        "comprehension (MRC) models. Our proposed method, termed Structured-MRC, employs a new task structure that \n",
        "strategically balances knowledge transfer and specialized information acquisition across languages. Rather than \n",
        "using one universal pre-training task, Structured-MRC synchronizes task-wise pre-training across related language \n",
        "pairs. This technique allows our models to effectively learn and transfer recurring patterns while avoiding \n",
        "overgeneralization. Comprehensive experiments are carried out on eight diverse languages from the XNLI, XNLG, \n",
        "MARC, and WikiMRC datasets, demonstrating that the Structured-MRC framework significantly outperforms \n",
        "state-of-the-art approaches in terms of consistency, comprehensibility, and generality. The insights gained \n",
        "from this study highlight the importance of structuring learning tasks for cross-lingual transfer in MRC, \n",
        "with implications for various NLP applications.\n",
        "\"\"\".strip()\n",
        "\n",
        "# Wrap the input in a list (the pipeline accepts a single string or a list of strings)\n",
        "inputs = [sample_abstract]\n",
        "\n",
        "print(\"📄 Input abstract prepared:\")\n",
        "print(f\"Length: {len(sample_abstract)} characters\")\n",
        "print(f\"Word count: ~{len(sample_abstract.split())} words\")\n",
        "print(\"\\n\" + \"=\"*50)\n",
        "print(sample_abstract)\n",
        "print(\"=\"*50)"
      ],
      "metadata": {
        "id": "input-preparation-cell"
      },
      "execution_count": null,
      "outputs": []
    },
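    {
      "cell_type": "markdown",
      "metadata": {
        "id": "token-count-note"
      },
      "source": [
        "**Optional check (added sketch):** sequence-to-sequence models measure input length in *tokens*, not words or characters, and silently truncate anything beyond their maximum input length. The cell below is an illustrative check, not part of the original workflow; it assumes `tokenizer` and `sample_abstract` are defined in the cells above."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Illustrative check: how many tokens does the abstract use?\n",
        "token_ids = tokenizer(sample_abstract)[\"input_ids\"]\n",
        "print(f\"Input length: {len(token_ids)} tokens\")\n",
        "\n",
        "# Warn if the input would be truncated by the model's maximum input length\n",
        "# (model_max_length can be a very large sentinel for some tokenizers)\n",
        "max_input = tokenizer.model_max_length\n",
        "if len(token_ids) > max_input:\n",
        "    print(f\"⚠️ Input exceeds the {max_input}-token limit and will be truncated.\")"
      ],
      "metadata": {
        "id": "token-count-cell"
      },
      "execution_count": null,
      "outputs": []
    },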
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "generation-section"
      },
      "source": [
        "## 4. Tweet Generation\n",
        "\n",
        "Now we'll use the pipeline to generate a tweet-length summary of our abstract.\n",
        "\n",
        "**Generation Parameters:**\n",
        "- `max_length=512`: Maximum number of tokens in the output (tokens, not characters, so the result can still exceed 280 characters)\n",
        "- `do_sample=False`: Use deterministic generation (greedy decoding) for consistent results\n",
        "- You can experiment with `do_sample=True` and `temperature` for more creative outputs"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Generate tweet-length summary from the abstract\n",
        "print(\"🚀 Generating tweet summary...\")\n",
        "print(\"This may take a few moments...\")\n",
        "\n",
        "# Run the text generation pipeline\n",
        "result = pipe(\n",
        "    inputs,\n",
        "    max_length=512,  # Maximum length of generated text, in tokens\n",
        "    do_sample=False,  # Use deterministic generation\n",
        "    # Alternative parameters you can experiment with:\n",
        "    # do_sample=True,  # Enable sampling for more creative outputs\n",
        "    # temperature=0.8,  # Controls randomness (lower = more focused, higher = more random)\n",
        "    # top_p=0.9,  # Nucleus sampling parameter\n",
        "    # num_return_sequences=3  # Generate multiple alternatives\n",
        ")\n",
        "\n",
        "# Extract the generated text\n",
        "generated_tweet = result[0][\"generated_text\"]\n",
        "\n",
        "print(\"✅ Tweet generation complete!\")"
      ],
      "metadata": {
        "id": "wPGtv6N1fxJM"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "results-section"
      },
      "source": [
        "## 5. Results and Analysis\n",
        "\n",
        "Let's examine the generated tweet and compare it with the original abstract."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Display the results with formatting\n",
        "print(\"🐦 GENERATED TWEET:\")\n",
        "print(\"=\"*60)\n",
        "print(generated_tweet)\n",
        "print(\"=\"*60)\n",
        "\n",
        "# Analyze the output\n",
        "tweet_length = len(generated_tweet)\n",
        "word_count = len(generated_tweet.split())\n",
        "compression_ratio = len(sample_abstract) / tweet_length if tweet_length > 0 else 0\n",
        "\n",
        "print(\"\\n📊 ANALYSIS:\")\n",
        "print(f\"• Tweet length: {tweet_length} characters\")\n",
        "print(f\"• Word count: {word_count} words\")\n",
        "print(f\"• Compression ratio: {compression_ratio:.1f}x smaller than original\")\n",
        "print(f\"• Twitter character limit: {280 - tweet_length} characters remaining\")\n",
        "\n",
        "# Check if it fits Twitter's character limit\n",
        "if tweet_length <= 280:\n",
        "    print(\"✅ Perfect! Fits within Twitter's 280-character limit\")\n",
        "else:\n",
        "    print(\"⚠️ Warning: Exceeds Twitter's 280-character limit\")\n",
        "    print(\"   Consider truncating or regenerating with a shorter max_length\")"
      ],
      "metadata": {
        "id": "results-analysis-cell"
      },
      "execution_count": null,
      "outputs": []
    },
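    {
      "cell_type": "markdown",
      "metadata": {
        "id": "multi-candidate-note"
      },
      "source": [
        "**Optional technique (added sketch):** a practical way to stay under the 280-character limit is to sample several candidate tweets and keep the longest one that still fits. The cell below is illustrative, not part of the original workflow; the parameter values are untuned examples, and it assumes `pipe` and `sample_abstract` are defined above."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Illustrative: sample several candidate tweets, then pick one that fits\n",
        "candidates = pipe(\n",
        "    sample_abstract,\n",
        "    max_length=128,\n",
        "    do_sample=True,\n",
        "    temperature=0.8,\n",
        "    top_p=0.9,\n",
        "    num_return_sequences=3\n",
        ")\n",
        "\n",
        "texts = [c[\"generated_text\"] for c in candidates]\n",
        "fitting = [t for t in texts if len(t) <= 280]\n",
        "\n",
        "# Prefer the longest candidate under the limit (most informative);\n",
        "# otherwise fall back to the shortest candidate overall\n",
        "best = max(fitting, key=len) if fitting else min(texts, key=len)\n",
        "print(f\"Selected ({len(best)} characters):\")\n",
        "print(best)"
      ],
      "metadata": {
        "id": "multi-candidate-cell"
      },
      "execution_count": null,
      "outputs": []
    },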
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "experimentation-section"
      },
      "source": [
        "## 6. Experimentation Section\n",
        "\n",
        "Try the model with your own abstracts! Replace the text in the cell below with your own academic abstract."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# 🧪 EXPERIMENT WITH YOUR OWN ABSTRACT\n",
        "# Replace the text below with your own academic abstract\n",
        "\n",
        "your_abstract = \"\"\"\n",
        "Paste your academic abstract here...\n",
        "\"\"\".strip()\n",
        "\n",
        "# Uncomment and run the lines below to test with your abstract\n",
        "# if len(your_abstract) > 20:  # Basic check to ensure there's actual content\n",
        "#     print(\"Testing with your abstract...\")\n",
        "#     your_result = pipe([your_abstract], max_length=512, do_sample=False)\n",
        "#     your_tweet = your_result[0][\"generated_text\"]\n",
        "#\n",
        "#     print(\"\\n🐦 YOUR GENERATED TWEET:\")\n",
        "#     print(\"=\"*60)\n",
        "#     print(your_tweet)\n",
        "#     print(\"=\"*60)\n",
        "#     print(f\"Length: {len(your_tweet)} characters\")\n",
        "# else:\n",
        "#     print(\"Please add your abstract text above and uncomment the code!\")\n",
        "\n",
        "print(\"Ready for your experimentation! 🚀\")\n",
        "print(\"Replace the placeholder text above and uncomment the code to test your own abstract.\")"
      ],
      "metadata": {
        "id": "experimentation-cell"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tips-section"
      },
      "source": [
        "## 7. Tips for Better Results\n",
        "\n",
        "### Input Quality Tips:\n",
        "- **Use complete abstracts**: The model works best with full, well-structured abstracts\n",
        "- **Optimal length**: 100-300 word abstracts tend to produce the best tweets\n",
        "- **Clear language**: Avoid excessive jargon or unclear abbreviations\n",
        "\n",
        "### Generation Parameters to Experiment With:\n",
        "```python\n",
        "# For more creative outputs:\n",
        "result = pipe(inputs, max_length=280, do_sample=True, temperature=0.8, top_p=0.9)\n",
        "\n",
        "# For multiple alternatives:\n",
        "result = pipe(inputs, max_length=280, num_return_sequences=3, do_sample=True)\n",
        "\n",
        "# For shorter outputs:\n",
        "result = pipe(inputs, max_length=150, do_sample=False)\n",
        "```\n",
        "\n",
        "### Post-Processing Ideas:\n",
        "- Add relevant hashtags manually\n",
        "- Include author mentions or handles\n",
        "- Add emojis for engagement\n",
        "- Include links to the full paper"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "conclusion-section"
      },
      "source": [
        "## 8. Conclusion\n",
        "\n",
        "You've successfully used a pre-trained transformer model to convert academic abstracts into tweet-length summaries! This tool can be incredibly useful for:\n",
        "\n",
        "- **Researchers** sharing their work on social media\n",
        "- **Science communicators** making research more accessible\n",
        "- **Academic institutions** promoting faculty publications\n",
        "- **Conference organizers** creating social media content\n",
        "\n",
        "### Next Steps:\n",
        "1. Try the model with different types of abstracts\n",
        "2. Experiment with different generation parameters\n",
        "3. Consider fine-tuning the model for your specific domain\n",
        "4. Integrate this into a larger content creation workflow\n",
        "\n",
        "### Model Information:\n",
        "- **Model**: `andersoncliffb/abstracts_to_tweet_model`\n",
        "- **Based on**: Transformer seq2seq architecture\n",
        "- **Training data**: Academic abstracts paired with tweet summaries\n",
        "- **License**: Check the model card on Hugging Face for licensing details\n",
        "\n",
        "Happy tweeting! 🐦✨"
      ]
    }
  ]
}