Generate from documentation

Create Rainbird knowledge graphs from prompts and business documentation such as policies and regulations, accelerating the build process and reducing manual effort.

All features delivered by Rainbird Labs are beta. They may contain bugs, are subject to change and are not covered by our platform SLAs. Your feedback can help shape development.

Getting Started

Within the Rainbird Studio, select Generate from Docs, provide a name for your graph, explain the graph you would like to be built and upload relevant documentation.

If you need support with writing a detailed prompt that articulates the details and nuances of your domain then consider using Consult. This uses knowledge elicitation techniques to extract tacit knowledge and information that may not be available in documentation alone.

Usage guidelines

Supported formats include TXT, PDF and CSV.
Only text is processed; images are removed before processing.
Document size may impact the generated knowledge graph.
- The document size restriction is approximate, but the combined length of the prompt and any documentation should be less than 50,000 words (approximately 90 pages of text).
Documentation structure and information may impact the generated knowledge graph. Please review documentation best practice below to optimise your documentation.
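
To check the size guideline before uploading, a rough word count can help. The following is a minimal sketch in Python, assuming the documents have already been extracted to plain text; the function names and the whitespace-based definition of a "word" are illustrative assumptions and may not match how Rainbird measures length.

```python
# Approximate limit taken from the usage guidelines above.
WORD_LIMIT = 50_000


def count_words(text: str) -> int:
    """Rough word count: the number of whitespace-separated tokens."""
    return len(text.split())


def total_words(prompt: str, documents: list[str]) -> int:
    """Combined word count of the prompt and every document."""
    return count_words(prompt) + sum(count_words(doc) for doc in documents)


def within_limit(prompt: str, documents: list[str], limit: int = WORD_LIMIT) -> bool:
    """True when the combined word count is below the limit."""
    return total_words(prompt, documents) < limit
```

For example, `within_limit(my_prompt, [policy_text, appendix_text])` returns `False` when the prompt and both documents together reach 50,000 words, which suggests splitting the documentation into smaller parts first.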

Documentation Best Practice

To improve the accuracy of the knowledge graph produced, please consider some of these tips:

Ensure the document is outcome-focused: Make sure your document is strictly focused on achieving a desired outcome. Remove any unnecessary introductions or preamble, and avoid including descriptive or scene-setting content that doesn't contribute to decision-making.
Simplify the formatting: Minimise complex formatting. Remove paragraph and chapter numbering, and convert multi-column text into a single, well-structured and fluid document.
Break the document into smaller parts: If needed, dissect your document into smaller, manageable segments. Each part should describe a specific piece of expertise. This approach often leads to better results when applied iteratively.
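
The tip about removing paragraph and chapter numbering can be partly automated for plain-text documents. This is a minimal sketch, assuming numbering appears at the start of a line in forms such as "1.", "2)" or "3.1.2"; the pattern and the function name are illustrative, not part of any Rainbird tooling.

```python
import re

# Leading numbering at the start of a line: dotted numbers such as "2.1" or
# "3.1.2." or a single number followed by "." or ")". A bare number followed
# by a space (e.g. "30 days") is deliberately left alone.
NUMBERING = re.compile(r"^\s*(?:\d+(?:\.\d+)+[.)]?|\d+[.)])\s+")


def strip_numbering(text: str) -> str:
    """Remove leading paragraph numbering from each line of the text."""
    return "\n".join(NUMBERING.sub("", line) for line in text.splitlines())
```

A sentence that genuinely begins with a decimal figure (for example "3.5 percent of claims ...") would also be stripped, so review the output before uploading it.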

The following types of documents should work well:

Standard Operating Procedures (SOPs) with clear process flows ✅
Regulatory compliance documents with explicit requirements ✅
Technical specifications with clear dependencies and constraints ✅
Policy manuals with well-defined rules and conditions ✅
Service Level Agreements (SLAs) with specific criteria and thresholds ✅
Decision trees and decision matrices in documented form ✅

Documents that are less likely to perform well include:

Documents that are highly visual, with meaning held in graphs, charts or complex tables ❌
Academic-style papers with multi-column layouts, complex mathematical calculations, or references and citations that break up the main text flow ❌
Scanned documents with poor OCR quality ❌
Documents with implicit knowledge assumptions ❌
Highly contextual or situational guidelines ❌
Documentation with undefined jargon or acronyms ❌

Important Information

Data Security

Generating from documentation uses third-party AI services (Anthropic hosted in AWS Bedrock). However, this data is encrypted in transit and at rest by AWS Bedrock and will not be used for training purposes. Data is processed within the EU, and processing continues to be ISO27001 compliant.

Limitations

Token limits apply. For large content and/or large knowledge graphs, these limits may be reached and incomplete RBLang returned, resulting in an error.
Service may be interrupted during peak periods.
Varying behaviour: LLMs are not deterministic so you will likely see different responses when testing the same content.
RBLang errors: whilst we continue to make improvements, some common errors are still created during this process. You can ask Co-author to fix some of these (copy and paste the error to Co-author), but others may need to be fixed manually. Some can be fixed in the Editor UI, whilst others will require fixing directly in RBLang (accessed via the </> code panel).

Common errors

Rules that reference instances that don’t exist
1. Identify the relationship this rule belongs to and the concept it’s attached to, then add the instance to that concept.
Incorrect use of variables in the rule header
1. You may see Subject="%S" or Subject="%PERSON" in the rule header (the start of the rule). This is invalid, as variables can only be used in conditions. Remove any subject and/or object from the rule header that contains a variable. If the rule used a custom variable (e.g. %PERSON), it may also have used it in the conditions below, where it should instead use the specially designated variables %S for the subject or %O for the object. These conditions will need to be updated.
Relationships referencing concepts that don’t exist
1. These tend to be called something generic such as String, Date, Number or Truth. You will need to review the graph, determine what the concept should be called, create a concept with this name, then update the relationship to refer to your new concept.
Invalid expressions
1. Unsupported expressions can sometimes be generated; these can be fixed manually in the UI. Use our documentation to help create a valid expression. Where an expression contains multiple OR conditions, this can sometimes indicate that the rule needs to be split into multiple rules. You can duplicate the rule and modify each copy, or prompt Co-author to separate it into distinct rules.

Once the errors are fixed and the graph saves successfully, create a version of it so you can return to it if required.
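
The "relationships referencing concepts that don’t exist" error above is a cross-reference check: every concept a relationship names must be defined in the graph. The sketch below illustrates that logic in Python. It does not parse RBLang; the list-and-dictionary layout, the key names and the function name are hypothetical, and you would fill them in by hand from your own graph.

```python
def find_missing_concepts(concepts, relationships):
    """Return (relationship name, missing concept) pairs in input order.

    `concepts` is a list of concept names; `relationships` is a list of
    dicts with "name", "subject" and "object" keys (a hypothetical layout,
    not the RBLang format).
    """
    known = set(concepts)
    missing = []
    for rel in relationships:
        for role in ("subject", "object"):
            if rel[role] not in known:
                missing.append((rel["name"], rel[role]))
    return missing


# Illustrative data only: "Date" is referenced but never defined.
concepts = ["Person", "Country"]
relationships = [
    {"name": "lives in", "subject": "Person", "object": "Country"},
    {"name": "born on", "subject": "Person", "object": "Date"},
]
print(find_missing_concepts(concepts, relationships))
```

Each pair reported points to a relationship to update once a suitably named concept has been created.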
