04 AI Triple Generation

🤖 AI-Powered Triple Generation

This cookbook covers how to automatically generate knowledge graph triples from unstructured text using Neode's AI capabilities.

What You'll Learn

  • Generating triples from text automatically
  • Processing articles and documents
  • Building extraction pipelines

Basic Triple Generation

Pass any text to the AI as context, and it extracts structured (subject, predicate, object) triples:

📝 Input Text:
Albert Einstein was a German-born theoretical physicist who developed the theory of relativity.
He was born in Ulm, Germany in 1879. Einstein won the Nobel Prize in Physics in 1921
for his discovery of the law of the photoelectric effect. He later became a US citizen
and worked at Princeton University until his death in 1955.

🔗 Generated Triples:
------------------------------------------------------------
  Albert Einstein → instance_of → theoretical physicist
  Albert Einstein → born_on → 1879-03-14
  Albert Einstein → born_in → Ulm, Württemberg, Germany
  Albert Einstein → died_on → 1955-04-18
  Albert Einstein → died_in → Princeton, New Jersey, United States
  Albert Einstein → developed → special theory of relativity
  Albert Einstein → developed → general theory of relativity
  Albert Einstein → awarded → Nobel Prize in Physics
  Albert Einstein → Nobel_Prize_in_Physics_award_year → 1921
  Nobel Prize in Physics 1921 → prize_motivation → discovery of the law of the photoelectric effect
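
The cell that produced this output is not reproduced here, so the following is only a minimal sketch of the general shape of such a call. The names `Triple`, `extract_triples`, `format_triple` and `fake_generate` are hypothetical and are not part of the Neode API; `fake_generate` stands in for the real AI call so the example runs offline.

```python
from typing import Callable, Iterable, NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) statement."""
    subject: str
    predicate: str
    obj: str


def extract_triples(text: str, generate: Callable[[str], Iterable[dict]]) -> list:
    """Run a triple generator over `text` and normalize its output.

    `generate` is a placeholder for whatever client call performs the
    extraction. It is assumed to return dicts with the keys
    "subject", "predicate" and "object".
    """
    return [
        Triple(item["subject"].strip(), item["predicate"].strip(), item["object"].strip())
        for item in generate(text)
    ]


def format_triple(triple: Triple) -> str:
    """Render a triple in the arrow notation used in the output above."""
    return f"  {triple.subject} → {triple.predicate} → {triple.obj}"


def fake_generate(text: str) -> list:
    # Hypothetical stand-in for the AI call: returns one fixed, untrimmed item.
    return [{"subject": " Albert Einstein ", "predicate": "born_in", "object": "Ulm, Germany"}]


text = "Albert Einstein was born in Ulm, Germany in 1879."
for triple in extract_triples(text, fake_generate):
    print(format_triple(triple))
```

Passing the generator in as a callable keeps the formatting and normalization logic independent of any particular client, which also makes it easy to test without network access.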

Processing News Articles

📰 Article Triples:
------------------------------------------------------------
  GPT-4 → announced_on → 2023-03-14
  GPT-4 → developed_by → OpenAI
  GPT-4 → is_a → large multimodal model
  GPT-4 → accepts_input_modality → text
  GPT-4 → accepts_input_modality → images
  GPT-4 → emits_output_modality → text
  OpenAI → headquartered_in → San Francisco, California, United States
  Sam Altman → ceo_of → OpenAI
  GPT-4 → integrated_into → Bing Chat
  GPT-4 → available_in → ChatGPT

Generate and Store Triples

Generate triples and immediately add them to your knowledge graph:

✅ Extracted and stored 10 triples!
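
The storage call itself is not shown here. As a rough illustration of what "generate, then store" involves, the sketch below uses a toy in-memory store. `InMemoryGraph` and its methods are hypothetical stand-ins, not the Neode knowledge graph API.

```python
from collections import defaultdict


class InMemoryGraph:
    """Toy stand-in for a knowledge graph store (not the Neode API)."""

    def __init__(self):
        # subject -> predicate -> set of objects
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, predicate, obj):
        """Add one triple; return True if it was not already stored."""
        objects = self._edges[subject][predicate]
        if obj in objects:
            return False
        objects.add(obj)
        return True

    def add_many(self, triples):
        """Add several triples and return how many of them were new."""
        return sum(self.add(*triple) for triple in triples)

    def objects(self, subject, predicate):
        """Return the sorted objects stored for a subject/predicate pair."""
        return sorted(self._edges.get(subject, {}).get(predicate, set()))


graph = InMemoryGraph()
generated = [
    ("GPT-4", "developed_by", "OpenAI"),
    ("Sam Altman", "ceo_of", "OpenAI"),
    ("GPT-4", "developed_by", "OpenAI"),  # exact repeat, ignored by the store
]
stored = graph.add_many(generated)
print(f"Extracted and stored {stored} triples")
```

Returning the number of newly stored triples, rather than the number submitted, makes repeated runs over the same text easy to spot.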

Processing Multiple Documents

📄 Document 1: Generated 10 triples
📄 Document 2: Generated 10 triples
📄 Document 3: Generated 10 triples

✅ Total triples generated: 30

🔗 All Generated Triples:
------------------------------------------------------------
  SpaceX → founded_by → Elon Musk
  SpaceX → incorporated_on → 2002-03-14
  SpaceX → operates_facility_in → Hawthorne, California
  SpaceX → headquartered_in → Starbase, Texas
  Falcon 9 → designed_and_manufactured_by → SpaceX
  Falcon 9 → first_launched_on → 2010-06-04
  Dragon (SpaceX spacecraft family) → developed_by → SpaceX
  Crew Dragon Demo-2 → launched_on → 2020-05-30
  SpaceX → sent_astronauts_to → International Space Station
  SpaceX Hawthorne facility → has_address → 1 Rocket Road, Hawthorne, California
  Tesla, Inc. → incorporated_on → 2003-07-01
  Tesla Motors, Inc. → incorporated_by → Martin Eberhard
  Tesla Motors, Inc. → incorporated_by → Marc Tarpenning
  Elon Musk → became_chairman_of → Tesla, Inc.
  Elon Musk → became_chairman_in → 2004-02
  Elon Musk → named_chief_executive_officer_of → Tesla, Inc.
  Elon Musk → became_ceo_in → 2008
  Tesla, Inc. → headquartered_in → Austin, Texas
  Tesla, Inc. → manufactures_vehicle_model → Tesla Model S
  Tesla, Inc. → manufactures_vehicle_model → Tesla Model 3
  Neuralink → is_a → American neurotechnology company
  Neuralink → founded_on → 2016-06-21
  Elon Musk → cofounded → Neuralink
  Neuralink → headquarters_location → Fremont, California, United States
  Neuralink → develops → implantable brain-computer interfaces
  Jared Birchall → chief_executive_officer_of → Neuralink
  Neuralink → website → https://neuralink.com/
  Neuralink → headquarters_address → 7400 Paseo Padre Pkwy, Fremont, CA 94555
  Neuralink → received_FDA_approval_for_human_clinical_trials_in → May 2023
  Neuralink → began_first_human_trials_in → September 2023
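
A batch run like the one summarized above amounts to a loop that calls the generator once per document and accumulates the results. The sketch below assumes a hypothetical `generate` callable; `process_documents` and `fake_generate` are illustrative names, and the fake extractor simply emits one placeholder triple per sentence.

```python
def process_documents(documents, generate):
    """Generate triples for each document and return them as one flat list."""
    all_triples = []
    for index, text in enumerate(documents, start=1):
        triples = list(generate(text))
        print(f"Document {index}: Generated {len(triples)} triples")
        all_triples.extend(triples)
    print(f"Total triples generated: {len(all_triples)}")
    return all_triples


def fake_generate(text):
    # Placeholder extractor: one triple per sentence, so the sketch runs offline.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [("document", "mentions", sentence) for sentence in sentences]


documents = [
    "SpaceX builds rockets. Falcon 9 is reusable.",
    "Tesla builds electric cars.",
]
all_triples = process_documents(documents, fake_generate)
```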

Building a Document Processing Pipeline

🔄 Processing documents...
  Processed 'python_wiki': 10 triples
  Processed 'js_wiki': 10 triples
  Processed 'rust_wiki': 10 triples

✅ Total triples extracted: 30

💾 Saved 30 triples to knowledge graph
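
One possible shape for such a pipeline is sketched below: named documents go in, each is processed independently, failures are recorded instead of aborting the run, and the collected triples are handed to a save step. `PipelineResult`, `run_pipeline`, `fake_generate` and the `save` callable are all hypothetical; the notebook's actual pipeline may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    per_document: dict = field(default_factory=dict)  # name -> list of triples
    failures: dict = field(default_factory=dict)      # name -> error message

    @property
    def triples(self):
        """All triples from all successfully processed documents."""
        return [t for triples in self.per_document.values() for t in triples]


def run_pipeline(documents, generate, save):
    """Extract triples from each named document, then save them in one call."""
    result = PipelineResult()
    for name, text in documents.items():
        try:
            result.per_document[name] = list(generate(text))
        except Exception as exc:
            # Record the failure and continue with the remaining documents.
            result.failures[name] = str(exc)
            continue
        print(f"  Processed '{name}': {len(result.per_document[name])} triples")
    save(result.triples)
    return result


def fake_generate(text):
    # Placeholder for the AI call; rejects empty input to exercise the error path.
    if not text.strip():
        raise ValueError("empty document")
    return [("doc", "has_text", text)]


saved = []
documents = {"python_wiki": "Python is a language.", "empty_doc": ""}
result = run_pipeline(documents, fake_generate, saved.extend)
print(f"Saved {len(saved)} triples, {len(result.failures)} document(s) failed")
```

Keeping per-document results separate from the flat list preserves provenance, which helps when a particular source later turns out to produce poor triples.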

Best Practices for Triple Generation

1. Chunk Large Documents

Split into 8 chunks

Chunk 1: 10 triples

Chunk 2: 10 triples
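
The chunking code is not shown here. A minimal sketch of one common approach is to split on sentence boundaries and pack sentences into chunks up to a character budget. The function name `chunk_text` and the `max_chars` parameter are assumptions, and the notebook may chunk differently (for example by tokens or paragraphs).

```python
import re


def chunk_text(text, max_chars=500):
    """Pack whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk
    rather than being cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


chunks = chunk_text("One. Two. Three.", max_chars=9)
print(f"Split into {len(chunks)} chunks")
for index, chunk in enumerate(chunks, start=1):
    print(f"Chunk {index}: {chunk!r}")
```

Breaking at sentence ends keeps each statement intact, so a fact is less likely to be split across two chunks and missed by the extractor.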

2. Deduplicate Generated Triples

Original: 3 triples
After deduplication: 2 triples
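
Deduplication can be sketched as follows, assuming triples are plain `(subject, predicate, object)` tuples and that two triples count as duplicates when they match after case folding and whitespace normalization. That matching rule, and the names `normalize` and `deduplicate`, are assumptions for illustration.

```python
def normalize(value):
    """Collapse runs of whitespace and ignore case for comparison purposes."""
    return " ".join(value.split()).casefold()


def deduplicate(triples):
    """Drop repeated triples, keeping the first occurrence and original order."""
    seen = set()
    unique = []
    for subject, predicate, obj in triples:
        key = (normalize(subject), normalize(predicate), normalize(obj))
        if key not in seen:
            seen.add(key)
            unique.append((subject, predicate, obj))
    return unique


triples = [
    ("Python", "created_by", "Guido van Rossum"),
    ("python", "created_by", "Guido van  Rossum"),  # differs only in case and spacing
    ("Python", "first_released_in", "1991"),
]
unique = deduplicate(triples)
print(f"Original: {len(triples)} triples")
print(f"After deduplication: {len(unique)} triples")
```

This only catches surface-level repeats. Triples that express the same fact with different predicates (for example `is_a` and `instance_of`) would need a separate predicate-mapping step.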

Next Steps

  • 05_graphs_and_entities.ipynb - Organize knowledge into separate graphs and manage entities