Repsoft Document Extractor API

Enterprise-grade document processing API for automated data extraction. Seamlessly integrate with your ERP systems for intelligent document automation.

Quick Start

Overview

The Repsoft Document Extractor API provides enterprise-grade document processing capabilities with asynchronous workflows. Submit documents for intelligent data extraction and receive immediate tracking identifiers for status monitoring.

Base URL

http://hub.api.cefero.com

Core Capabilities

  • High-Volume Processing: Handle files up to 500MB with optimized upload speeds
  • Asynchronous Architecture: Non-blocking operations with immediate tracking ID response
  • Real-Time Monitoring: Poll-based status updates for process visibility
  • Intelligent Queuing: Automatic load balancing and priority handling
  • Structured Results: JSON-formatted extracted data ready for ERP integration
  • Enterprise Security: Token authentication via the Authorization header, plus an X-ClientId header that identifies your organization

Authentication

All API requests require authentication headers for authorization and client identification. These headers must be included in every request.

Need API Access?

Don't have API credentials yet? Contact us to request API access and we'll set up your account with the necessary authentication tokens.

Required Headers

Header Name   | Value                                             | Description
Authorization | cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw | Production API authentication token
X-ClientId    | <Your Company Name>                               | Your organization identifier

Example Request Headers

Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name

Important

Replace <Your Company Name> with your actual organization name. Both headers are required for all API endpoints.

Security Best Practices

Store the Authorization token securely using environment variables or secret management systems. Never commit credentials to version control or expose them in client-side code.
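
As a minimal sketch of the environment-variable approach, the helper below builds the two required headers at runtime instead of hardcoding them. The variable names CEFERO_API_TOKEN and CEFERO_CLIENT_ID are hypothetical choices for this example and are not defined by the API.

```python
import os


def build_auth_headers(env=None):
    """Build the Authorization and X-ClientId headers from environment variables.

    CEFERO_API_TOKEN and CEFERO_CLIENT_ID are hypothetical names used only
    in this sketch; pick whatever fits your deployment.
    """
    env = os.environ if env is None else env
    token = env.get("CEFERO_API_TOKEN", "").strip()
    client_id = env.get("CEFERO_CLIENT_ID", "").strip()
    if not token or not client_id:
        # Fail fast so a missing secret never turns into a confusing 401 later
        raise RuntimeError("Set CEFERO_API_TOKEN and CEFERO_CLIENT_ID before calling the API")
    return {"Authorization": token, "X-ClientId": client_id}
```

Passing a plain dictionary as env makes the helper easy to exercise in tests without touching the real environment.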

Upload File

POST /v1/files

Submit a file for processing. Returns immediately with a tracking ID and queues the file for extraction.

Request Headers

Header        | Type   | Required | Description
Authorization | string | Required | API authentication token
X-ClientId    | string | Required | Your organization identifier

Request Body (multipart/form-data)

Field       | Type | Required | Description
UploadFiles | file | Required | File to upload (max 500MB)

Example Request

POST /v1/files HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary

------WebKitFormBoundary
Content-Disposition: form-data; name="UploadFiles"; filename="document.pdf"
Content-Type: application/pdf

[binary file data]
------WebKitFormBoundary--

Example Response (202 Accepted)

{
  "fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
  "status": "queued",
  "message": "File upload queued for processing",
  "queuedAt": "2025-11-24T10:30:00Z",
  "fileName": "document.pdf",
  "fileSize": 1048576
}

Response Codes

Code | Status            | Description
202  | Accepted          | File successfully queued for processing
400  | Bad Request       | Invalid file or request parameters
401  | Unauthorized      | Missing or invalid authentication token
413  | Payload Too Large | File exceeds 500MB size limit
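
The response codes above can be handled on the client with a small amount of glue. The sketch below is one possible shape, not part of any official client: the function names are hypothetical, and it assumes that "500MB" means 500 * 1024 * 1024 bytes, which the documentation does not state explicitly.

```python
# Assumption: the documented 500MB limit is interpreted as 500 MiB
MAX_UPLOAD_BYTES = 500 * 1024 * 1024


def check_upload_size(size_bytes):
    """Return True if a non-empty file of this size fits the documented limit."""
    return 0 < size_bytes <= MAX_UPLOAD_BYTES


def handle_upload_response(status_code, body):
    """Return the fileId from a 202 response, or raise with a readable reason."""
    if status_code == 202:
        return body["fileId"]
    reasons = {
        400: "invalid file or request parameters",
        401: "missing or invalid authentication token",
        413: "file exceeds the 500MB size limit",
    }
    reason = reasons.get(status_code, "unexpected response")
    raise RuntimeError(f"Upload failed ({status_code}): {reason}")
```

Checking the size locally before sending avoids transferring a large file only to receive a 413.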

Get File Status

GET /v1/files/{file_id}

Check the current processing status of your file using its unique identifier.

Path Parameters

Parameter | Type          | Required | Description
file_id   | string (GUID) | Required | The unique identifier returned from the upload endpoint

Request Headers

Header        | Type   | Required | Description
Authorization | string | Required | API authentication token
X-ClientId    | string | Required | Your organization identifier

Example Request

GET /v1/files/a3b8c9d0-1234-5678-90ab-cdef12345678 HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name

Example Response (200 OK - Completed)

{
  "fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
  "status": "completed",
  "fileName": "document.pdf",
  "fileSize": 1048576,
  "uploadedAt": "2025-11-24T10:30:00Z",
  "completedAt": "2025-11-24T10:30:45Z",
  "processingTime": 45,
  "results": {
    "processed": true,
    "metadata": {
      "pages": 10,
      "format": "PDF"
    }
  }
}

Example Response (202 Accepted - Processing)

{
  "fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
  "status": "processing",
  "fileName": "document.pdf",
  "uploadedAt": "2025-11-24T10:30:00Z",
  "message": "File is currently being processed"
}

Response Codes

Code | Status                | Description
200  | OK                    | File processing completed successfully
202  | Accepted              | File is queued or currently processing
404  | Not Found             | File ID not found
500  | Internal Server Error | File processing failed
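
Because the status endpoint signals progress through the HTTP status code, a caller can decide what to do next without inspecting the body. The following sketch maps the documented codes to four outcomes; the function name and the outcome labels are hypothetical and chosen only for illustration.

```python
def classify_status(status_code):
    """Map a status-endpoint HTTP code to a client-side outcome.

    Outcome labels are this sketch's own vocabulary, not API values.
    """
    outcomes = {
        200: "completed",   # results are in the response body
        202: "pending",     # queued or still processing: poll again later
        404: "not_found",   # unknown file ID: do not retry
        500: "failed",      # processing failed: route to an error queue
    }
    if status_code not in outcomes:
        raise ValueError(f"Undocumented status code: {status_code}")
    return outcomes[status_code]
```

Only the "pending" outcome should trigger another poll; the other three are terminal for that file ID.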

Get Recent Results

GET /v1/files/results

Retrieve a list of your recently completed documents with their processing results.

Query Parameters

Parameter | Type    | Required | Description
limit     | integer | Optional | Number of recent files to retrieve (1-10, default: 10)

Request Headers

Header        | Type   | Required | Description
Authorization | string | Required | API authentication token
X-ClientId    | string | Required | Your organization identifier

Example Request

GET /v1/files/results?limit=5 HTTP/1.1
Host: hub.api.cefero.com
Authorization: cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw
X-ClientId: Your Company Name

Example Response (200 OK)

{
  "totalResults": 2,
  "limit": 5,
  "files": [
    {
      "fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
      "status": "completed",
      "fileName": "document1.pdf",
      "fileSize": 1048576,
      "uploadedAt": "2025-11-24T10:30:00Z",
      "completedAt": "2025-11-24T10:30:45Z",
      "processingTime": 45
    },
    {
      "fileId": "b4c9d0e1-2345-6789-01bc-def123456789",
      "status": "completed",
      "fileName": "document2.pdf",
      "fileSize": 2097152,
      "uploadedAt": "2025-11-24T09:15:00Z",
      "completedAt": "2025-11-24T09:16:30Z",
      "processingTime": 90
    }
  ]
}

Response Codes

Code | Status       | Description
200  | OK           | Successfully retrieved recent results
400  | Bad Request  | Invalid query parameters
401  | Unauthorized | Missing or invalid authentication token
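
A client can avoid the 400 response by validating limit before sending the request. The sketch below builds the request URL and pulls a compact summary out of a response shaped like the example above; both function names are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "http://hub.api.cefero.com"


def results_url(limit=10):
    """Build the recent-results URL, enforcing the documented 1-10 range."""
    if not 1 <= limit <= 10:
        raise ValueError("limit must be between 1 and 10")
    return f"{BASE_URL}/v1/files/results?{urlencode({'limit': limit})}"


def summarize(body):
    """Return (fileName, processingTime) pairs from a results response body."""
    return [(f["fileName"], f["processingTime"]) for f in body.get("files", [])]
```

In the example response, processingTime matches the gap between uploadedAt and completedAt in seconds (45 and 90).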

.NET SDK (Cefero.ApiSdk)

Cefero.ApiSdk is the official .NET SDK for the Cefero Document Extractor API. It is distributed as a NuGet package and simplifies integration with strongly typed models, async/await support, and built-in error handling.

Why Use the SDK?

  • Simplified Integration: No need to manage HTTP requests manually - the SDK handles all API communication
  • Strong Typing: Full IntelliSense support with typed request/response models
  • Built-in Error Handling: Comprehensive exception handling and retry logic
  • Async/Await Support: Modern async patterns for non-blocking operations
  • Production Ready: Battle-tested in enterprise environments

SDK Features

Installation

Install the Cefero.ApiSdk package from NuGet using one of the following methods:

Package Manager Console

Install-Package Cefero.ApiSdk

.NET CLI

dotnet add package Cefero.ApiSdk

Package Reference (Manual)

Add this to your .csproj file:

<PackageReference Include="Cefero.ApiSdk" Version="1.0.4" />

Requirements
  • .NET 6.0 or higher
  • Compatible with .NET 8.0, .NET 9.0, and later versions
  • Works with ASP.NET Core, Console Apps, Windows Services, and more

Configuration

Configure the SDK by creating a CeferoClient instance with your API credentials.

Basic Configuration

using Cefero.ApiSdk;

// Initialize the client
var ceferoClient = new CeferoClient(
    apiKey: "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
    clientKey: "Your Company Name",
    baseUrl: "http://hub.api.cefero.com"
);

Configuration with appsettings.json

Recommended approach for production applications:

1. Create appsettings.json

{
  "Cefero": {
    "ApiKey": "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
    "ClientKey": "Your Company Name",
    "BaseUrl": "http://hub.api.cefero.com"
  }
}

2. Create Configuration Classes

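// ConfigurationBuilder and its extension methods live in this namespace
using Microsoft.Extensions.Configuration;

public class CeferoSettings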
{
    public string ApiKey { get; set; } = string.Empty;
    public string ClientKey { get; set; } = string.Empty;
    public string BaseUrl { get; set; } = "http://hub.api.cefero.com";
}

public class AppConfiguration
{
    public CeferoSettings Cefero { get; set; } = new();

    public static AppConfiguration Load()
    {
        var configuration = new ConfigurationBuilder()
            .SetBasePath(AppContext.BaseDirectory)
            .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
            .AddEnvironmentVariables()
            .Build();

        var config = new AppConfiguration();
        configuration.Bind(config);
        return config;
    }
}

3. Initialize Client from Configuration

var config = AppConfiguration.Load();

var ceferoClient = new CeferoClient(
    config.Cefero.ApiKey,
    config.Cefero.ClientKey,
    config.Cefero.BaseUrl
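);

Security Best Practice

For production environments, store sensitive credentials in environment variables or Azure Key Vault instead of appsettings.json. In the configuration example above, AddEnvironmentVariables() is registered after AddJsonFile(), so environment variables such as Cefero__ApiKey override the values in appsettings.json.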

Dependency Injection Setup (ASP.NET Core)

// In Program.cs or Startup.cs (IOptions<T> requires: using Microsoft.Extensions.Options;)
builder.Services.Configure<CeferoSettings>(
    builder.Configuration.GetSection("Cefero"));

builder.Services.AddSingleton<ICeferoClient>(sp =>
{
    var settings = sp.GetRequiredService<IOptions<CeferoSettings>>().Value;
    return new CeferoClient(
        settings.ApiKey,
        settings.ClientKey,
        settings.BaseUrl
    );
});

SDK Usage Examples

The SDK provides three main operations: uploading files, checking file status, and retrieving all results.

Quick Start Example

The simplest way to get started with the SDK:

using Cefero.ApiSdk;

// Initialize the client with your credentials
var ceferoClient = new CeferoClient(
    apiKey: "<api-key>",
    clientKey: "<client-key>",
    baseUrl: "http://hub.api.cefero.com"
);

// Upload a document; the call returns as soon as the file is queued
var result = await ceferoClient.UploadFileAsync("invoice.pdf");

Console.WriteLine($"File ID: {result.FileId}, Status: {result.Status}");

1. Upload a File

Upload a document for processing and receive a tracking ID:

using Cefero.ApiSdk;

var ceferoClient = new CeferoClient(
    "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
    "Your Company Name",
    "http://hub.api.cefero.com"
);

// Upload a single file
var filePath = @"C:\Documents\invoice.pdf";
var result = await ceferoClient.UploadFileAsync(filePath);

Console.WriteLine($"File ID: {result.FileId}");
Console.WriteLine($"Status: {result.Status}");
Console.WriteLine($"Message: {result.Message}");

Upload Multiple Files

var files = new[]
{
    @"C:\Documents\invoice1.pdf",
    @"C:\Documents\invoice2.pdf",
    @"C:\Documents\receipt.jpg"
};

foreach (var file in files)
{
    try
    {
        var result = await ceferoClient.UploadFileAsync(file);
        Console.WriteLine($"✓ {Path.GetFileName(file)} - ID: {result.FileId}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"✗ {Path.GetFileName(file)} - Error: {ex.Message}");
    }
}

2. Check File Status

Query the processing status and retrieve results once completed:

var fileId = "a3b8c9d0-1234-5678-90ab-cdef12345678";
var result = await ceferoClient.GetFileStatusAsync(fileId);

if (!result.Success)
{
    Console.WriteLine($"Error: {result.Message}");
    return;
}

foreach (var file in result.Data)
{
    Console.WriteLine($"File: {file.FileName}");
    Console.WriteLine($"Status: {file.Status}");
    Console.WriteLine($"Queued At: {file.QueuedAt}");

    // Access OCR results if processing completed
    if (file.Result?.OcrResult != null)
    {
        var ocr = file.Result.OcrResult;

        // Display header fields
        if (ocr.Header?.Fields != null)
        {
            Console.WriteLine("\nHeader Information:");
            foreach (var field in ocr.Header.Fields)
            {
                Console.WriteLine($"  {field.Key}: {field.Value}");
            }
        }

        // Display page data
        if (ocr.Pages != null)
        {
            Console.WriteLine($"\nTotal Pages: {ocr.Pages.Count}");

            foreach (var page in ocr.Pages)
            {
                Console.WriteLine($"\nPage {page.Page}:");
                if (page.Lines != null)
                {
                    foreach (var line in page.Lines)
                    {
                        if (line.Fields != null)
                        {
                            foreach (var field in line.Fields)
                            {
                                Console.WriteLine($"    {field.Key}: {field.Value}");
                            }
                        }
                    }
                }
            }
        }
    }
}

Polling Pattern for Completion

public async Task<FileData> WaitForCompletionAsync(
    string fileId,
    int maxWaitSeconds = 300,
    CancellationToken cancellationToken = default)
{
    var elapsed = 0;
    var pollInterval = 2; // seconds

    while (elapsed < maxWaitSeconds)
    {
        var result = await ceferoClient.GetFileStatusAsync(fileId, cancellationToken);

        if (result.Success && result.Data.Count > 0)
        {
            var file = result.Data[0];

            if (file.Status == "completed")
                return file;

            if (file.Status == "failed")
                throw new Exception($"Processing failed: {file.Error}");
        }

        await Task.Delay(pollInterval * 1000, cancellationToken);
        elapsed += pollInterval;
    }

    throw new TimeoutException($"Processing timeout after {maxWaitSeconds} seconds");
}

// Usage
var uploadResult = await ceferoClient.UploadFileAsync("invoice.pdf");
var completedFile = await WaitForCompletionAsync(uploadResult.FileId);
Console.WriteLine($"Processing completed for {completedFile.FileName}");

3. Get All Results

Retrieve a list of recently processed files:

// Get all results (default: last 10 files)
var results = await ceferoClient.GetAllResultsAsync();

Console.WriteLine($"Total Files: {results.Data.Count}");

foreach (var file in results.Data)
{
    Console.WriteLine($"\nFile: {file.FileName}");
    Console.WriteLine($"  ID: {file.FileId}");
    Console.WriteLine($"  Status: {file.Status}");
    Console.WriteLine($"  Size: {file.FileSize:N0} bytes");
    Console.WriteLine($"  Uploaded: {file.QueuedAt}");

    if (file.Result?.OcrResult != null)
    {
        var ocr = file.Result.OcrResult;
        Console.WriteLine($"  Pages: {ocr.Pages?.Count ?? 0}");
        Console.WriteLine($"  Header Fields: {ocr.Header?.Fields?.Count ?? 0}");
    }
}

With Custom Page Size

// Get last 5 results
var results = await ceferoClient.GetAllResultsAsync(pageSize: 5);

// Get last 10 results (the REST endpoint documents a limit range of 1-10)
var moreResults = await ceferoClient.GetAllResultsAsync(pageSize: 10);

Complete Application Example

Full console application demonstrating all SDK features:

using Cefero.ApiSdk;

class Program
{
    static async Task Main(string[] args)
    {
        var client = new CeferoClient(
            "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
            "Your Company Name",
            "http://hub.api.cefero.com"
        );

        try
        {
            // 1. Upload a file
            Console.WriteLine("Uploading file...");
            var uploadResult = await client.UploadFileAsync("invoice.pdf");
            Console.WriteLine($"✓ File uploaded - ID: {uploadResult.FileId}");

            // 2. Wait for processing
            Console.WriteLine("Waiting for processing...");
            var fileData = await WaitForCompletion(client, uploadResult.FileId);
            Console.WriteLine("✓ Processing completed");

            // 3. Display results
            if (fileData.Result?.OcrResult != null)
            {
                var ocr = fileData.Result.OcrResult;
                Console.WriteLine($"\nExtracted Data:");
                Console.WriteLine($"  Pages: {ocr.Pages?.Count ?? 0}");

                if (ocr.Header?.Fields != null)
                {
                    Console.WriteLine("  Header:");
                    foreach (var field in ocr.Header.Fields)
                        Console.WriteLine($"    {field.Key}: {field.Value}");
                }
            }

            // 4. Get all recent results
            Console.WriteLine("\nFetching recent results...");
            var allResults = await client.GetAllResultsAsync(pageSize: 5);
            Console.WriteLine($"✓ Found {allResults.Data.Count} recent files");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"✗ Error: {ex.Message}");
        }
    }

    static async Task<FileData> WaitForCompletion(
        ICeferoClient client,
        string fileId)
    {
        for (int i = 0; i < 60; i++) // Max 2 minutes
        {
            var result = await client.GetFileStatusAsync(fileId);

            if (result.Success && result.Data.Count > 0)
            {
                var file = result.Data[0];
                if (file.Status == "completed") return file;
                if (file.Status == "failed")
                    throw new Exception($"Failed: {file.Error}");
            }

            await Task.Delay(2000); // Wait 2 seconds
        }

        throw new TimeoutException("Processing timeout");
    }
}

Error Handling

The SDK provides comprehensive error handling:

try
{
    var result = await ceferoClient.UploadFileAsync(filePath);

    if (!result.Success)
    {
        Console.WriteLine($"Upload failed: {result.Message}");
        return;
    }

    Console.WriteLine($"Success: {result.FileId}");
}
catch (FileNotFoundException ex)
{
    Console.WriteLine($"File not found: {ex.Message}");
}
catch (HttpRequestException ex)
{
    Console.WriteLine($"Network error: {ex.Message}");
}
catch (UnauthorizedAccessException ex)
{
    Console.WriteLine($"Authentication failed: {ex.Message}");
}
catch (Exception ex)
{
    Console.WriteLine($"Unexpected error: {ex.Message}");
}

Sample Projects Available

Complete working examples are available in the SDK repository:

  • Api Usage test: Demonstrates file upload, status checking, and results retrieval
  • Api Get all result: Shows how to retrieve and display all processed results
  • Both projects include configuration management, logging, and error handling

ICeferoClient Interface

The SDK exposes the following interface for easy testing and dependency injection:

public interface ICeferoClient
{
    Task<UploadResponse> UploadFileAsync(
        string filePath,
        CancellationToken cancellationToken = default);

    Task<ApiResponse<List<FileData>>> GetFileStatusAsync(
        string fileId,
        CancellationToken cancellationToken = default);

    Task<ApiResponse<List<FileData>>> GetAllResultsAsync(
        int? pageSize = null,
        CancellationToken cancellationToken = default);
}

ERP Integration Guide

Learn how to seamlessly integrate the Document Extractor API into your ERP system for automated document processing and data extraction.

Recommended for .NET Applications

If you're building a .NET-based ERP integration, we strongly recommend using the Cefero.ApiSdk NuGet package instead of implementing HTTP calls manually. The SDK provides type safety, error handling, and simplified configuration.

Integration Overview

The API is designed for easy integration with any ERP system. The asynchronous processing model allows your ERP to submit documents and continue operations without blocking.

Authentication Setup

Every API request must include authentication headers. These credentials identify your organization and authorize access to the Document Extractor service.

Required Headers for All Requests

The following headers must be included in every API call:

Header Name   | Value                                             | Description
Authorization | cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw | Production API authentication token
X-ClientId    | <Your Company Name>                               | Your organization identifier (replace with your actual company name)

Implementation Example

Here's how to configure these headers in different platforms:

// .NET Example
var client = new HttpClient();
client.DefaultRequestHeaders.Add("Authorization", "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw");
client.DefaultRequestHeaders.Add("X-ClientId", "Your Company Name");

// Angular/TypeScript Example
const headers = new HttpHeaders({
  'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
  'X-ClientId': 'Your Company Name'
});

// Python Example
headers = {
    'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
    'X-ClientId': 'Your Company Name'
}

Important Notes
  • Replace <Your Company Name> with your actual organization name
  • Store the Authorization token securely (use environment variables or secret management)
  • These headers are required for all endpoints: upload, status check, and results retrieval
  • Missing or invalid headers will result in a 401 Unauthorized response
  • Don't have API access? Contact us to request your API credentials

Recommended Workflow

  1. Document Capture - Monitor incoming documents (invoices, purchase orders, receipts) through your ERP's document upload module
  2. API Upload - Submit documents to the API endpoint and receive tracking IDs immediately
  3. Status Tracking - Poll the status endpoint to monitor processing (the API documents poll-based status updates only)
  4. Data Extraction - Retrieve structured data once processing completes
  5. ERP Mapping - Map extracted fields to your ERP's data model (GL accounts, vendors, line items)
  6. Validation - Implement business rule validation before committing to your ERP database
  7. Auto-Posting - Create transactions automatically or queue for human approval
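
Steps 2 through 7 of the workflow above can be sketched as a single function that receives each stage as a callable. Every parameter name here is hypothetical; the point is only to show how the stages chain together and where the approval fallback sits.

```python
def process_document(path, upload, wait_for_result, map_fields,
                     is_valid, post, queue_for_review):
    """Run one document through the workflow; all stages are injected callables."""
    file_id = upload(path)               # step 2: submit and get a tracking ID
    result = wait_for_result(file_id)    # steps 3-4: poll, then fetch extracted data
    record = map_fields(result)          # step 5: map to the ERP data model
    record["source_file_id"] = file_id   # keep the tracking ID for the audit trail
    if is_valid(record):                 # step 6: business rule validation
        post(record)                     # step 7: create the transaction
        return "posted"
    queue_for_review(record)             # step 7 fallback: human approval
    return "review"
```

Injecting the stages keeps the orchestration testable without network access and lets the same flow run against either the REST API or an SDK.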

Integration Patterns

Pattern 1: Real-Time Processing

Upload documents as soon as they arrive in your ERP and poll for results. Suitable for low-volume scenarios.

Pattern 2: Batch Processing

Queue documents throughout the day and process them in scheduled batches (e.g., hourly or end-of-day). Ideal for high-volume operations.

Pattern 3: Event-Driven Integration

Use background workers or message queues to handle uploads and status checks asynchronously. Recommended for enterprise deployments.
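
For Pattern 3, a minimal in-process version uses a thread-safe queue and one background worker. This is a sketch under simplifying assumptions: a production deployment would more likely use a durable message broker, and the handle callable stands in for the upload-and-poll logic.

```python
import queue
import threading


def run_worker(jobs, handle, results):
    """Consume document paths from jobs until a None sentinel arrives."""
    while True:
        path = jobs.get()
        if path is None:
            # Sentinel: acknowledge it and stop the worker
            jobs.task_done()
            break
        try:
            results.append((path, handle(path)))
        except Exception as exc:
            # Record the failure and keep the worker alive for the next job
            results.append((path, f"error: {exc}"))
        finally:
            jobs.task_done()
```

A producer puts file paths on the queue as documents arrive and puts None once to shut the worker down; jobs.join() blocks until everything queued so far has been handled.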

Data Mapping Considerations

Map the API's extracted data to your ERP's data structures:

  • Vendor Information: Match extracted vendor names/IDs to your vendor master file
  • GL Accounts: Apply business logic to assign proper general ledger account codes
  • Cost Centers: Route documents to appropriate departments or cost centers
  • Line Items: Parse and validate individual transaction lines
  • Tax Handling: Map extracted tax amounts to your tax codes and jurisdictions
  • Document Metadata: Store file IDs, processing dates, and audit trails
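
The mapping considerations above might look like the following sketch for header fields. It assumes the extracted fields arrive as a dictionary keyed by "vendor", "invoice_number", and "total_amount" (the keys used in the .NET example in this guide) and that vendor_master maps lowercase vendor names to ERP vendor IDs; both shapes are assumptions made for illustration.

```python
from decimal import Decimal, InvalidOperation


def map_header_fields(fields, vendor_master):
    """Map extracted header fields to an ERP record, flagging anything doubtful."""
    vendor_name = (fields.get("vendor") or "").strip()
    record = {
        "vendor_id": vendor_master.get(vendor_name.lower()),
        "invoice_number": fields.get("invoice_number"),
        "total_amount": None,
        "needs_review": False,
    }
    try:
        # Decimal avoids float rounding on money; thousands separators are stripped
        raw_total = str(fields.get("total_amount", "")).replace(",", "")
        record["total_amount"] = Decimal(raw_total)
    except InvalidOperation:
        record["needs_review"] = True
    if record["vendor_id"] is None:
        # Unknown vendor: send to manual review instead of guessing
        record["needs_review"] = True
    return record
```

Flagging a record for review instead of raising keeps one bad document from blocking the rest of a batch.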

Integration Examples

.NET ERP Integration (Using SDK)

Background service for processing documents in your .NET-based ERP using the Cefero SDK:

using Cefero.ApiSdk;

// Document processing service using Cefero SDK
public class DocumentProcessingService : BackgroundService
{
    private readonly ICeferoClient _ceferoClient;
    private readonly IErpRepository _erpRepo;
    private readonly ILogger<DocumentProcessingService> _logger;

    public DocumentProcessingService(
        ICeferoClient ceferoClient,
        IErpRepository erpRepo,
        ILogger<DocumentProcessingService> logger)
    {
        _ceferoClient = ceferoClient;
        _erpRepo = erpRepo;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            var pendingDocs = await _erpRepo.GetPendingDocumentsAsync();

            foreach (var doc in pendingDocs)
            {
                try
                {
                    // Upload to API using SDK
                    var uploadResult = await _ceferoClient.UploadFileAsync(
                        doc.FilePath, stoppingToken);

                    await _erpRepo.UpdateFileIdAsync(doc.Id, uploadResult.FileId);

                    // Poll for completion and map to ERP
                    var result = await WaitForCompletionAsync(
                        uploadResult.FileId, stoppingToken);

                    await MapAndSaveToErpAsync(result);

                    _logger.LogInformation(
                        "Successfully processed document {FileName}", doc.FileName);
                }
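                catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
                {
                    // Host is shutting down: stop without marking the document as failed
                    throw;
                }
                catch (Exception ex)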
                {
                    _logger.LogError(ex,
                        "Failed to process document {FileName}", doc.FileName);
                    await _erpRepo.MarkAsFailedAsync(doc.Id, ex.Message);
                }
            }

            await Task.Delay(TimeSpan.FromMinutes(5), stoppingToken);
        }
    }

    private async Task<FileData> WaitForCompletionAsync(
        string fileId,
        CancellationToken cancellationToken)
    {
        for (int i = 0; i < 150; i++) // Max 5 minutes
        {
            var result = await _ceferoClient.GetFileStatusAsync(
                fileId, cancellationToken);

            if (result.Success && result.Data.Count > 0)
            {
                var file = result.Data[0];
                if (file.Status == "completed") return file;
                if (file.Status == "failed")
                    throw new Exception($"Processing failed: {file.Error}");
            }

            await Task.Delay(2000, cancellationToken);
        }

        throw new TimeoutException("Processing timeout");
    }

    private async Task MapAndSaveToErpAsync(FileData fileData)
    {
        var ocrResult = fileData.Result?.OcrResult;
        if (ocrResult == null) return;

        // Map OCR results to ERP data model
        var erpDocument = new ErpDocument
        {
            OriginalFileId = fileData.FileId,
            FileName = fileData.FileName,
            ProcessedDate = DateTime.UtcNow
        };

        // Extract header information
        if (ocrResult.Header?.Fields != null)
        {
            foreach (var field in ocrResult.Header.Fields)
            {
                // Map to ERP fields based on your business logic
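                switch (field.Key.ToLowerInvariant())
                {
                    case "vendor":
                        erpDocument.VendorName = field.Value;
                        break;
                    case "invoice_number":
                        erpDocument.InvoiceNumber = field.Value;
                        break;
                    case "total_amount":
                        // Parse with the invariant culture so the result does not depend on
                        // server locale; skip the value instead of throwing if it is not numeric
                        if (decimal.TryParse(field.Value,
                                System.Globalization.NumberStyles.Number,
                                System.Globalization.CultureInfo.InvariantCulture,
                                out var totalAmount))
                        {
                            erpDocument.TotalAmount = totalAmount;
                        }
                        break;
                }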
            }
        }

        // Save to ERP database
        await _erpRepo.SaveDocumentAsync(erpDocument);
    }
}

// Configure in Program.cs or Startup.cs
builder.Services.AddSingleton<ICeferoClient>(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    return new CeferoClient(
        config["Cefero:ApiKey"],
        config["Cefero:ClientKey"],
        config["Cefero:BaseUrl"]
    );
});

builder.Services.AddHostedService<DocumentProcessingService>();

Angular ERP Dashboard

Upload component for web-based ERP interface:

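// Service with authentication
import { Injectable } from '@angular/core';
import { HttpClient, HttpHeaders } from '@angular/common/http';
import { Observable } from 'rxjs';

@Injectable({ providedIn: 'root' })
export class DocumentExtractorService {
  private baseUrl = 'http://hub.api.cefero.com';
  // Headers set in browser code are visible to end users. In production,
  // proxy these calls through your backend so the token stays server-side.
  private headers = new HttpHeaders({
    'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
    'X-ClientId': 'Your Company Name'
  });

  constructor(private http: HttpClient) {}

  // Observable requires a type argument; any keeps the example short
  uploadFile(file: File): Observable<any> {
    const formData = new FormData();
    formData.append('UploadFiles', file);
    return this.http.post<any>(`${this.baseUrl}/v1/files`, formData,
      { headers: this.headers });
  }

  getStatus(fileId: string): Observable<any> {
    return this.http.get<any>(`${this.baseUrl}/v1/files/${fileId}`,
      { headers: this.headers });
  }
}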

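// Component usage
import { Observable, interval } from 'rxjs';
import { filter, switchMap, take } from 'rxjs/operators';
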
export class ErpDocumentUploadComponent {
  constructor(
    private extractorService: DocumentExtractorService,
    private erpService: ErpDataService
  ) {}

  processDocument(file: File): void {
    this.extractorService.uploadFile(file).pipe(
      switchMap(response => this.pollUntilComplete(response.fileId)),
      switchMap(result => this.erpService.createTransaction(result))
    ).subscribe({
      next: (transaction) => this.notifySuccess(transaction),
      error: (error) => this.handleError(error)
    });
  }

  private pollUntilComplete(fileId: string): Observable<any> {
    return interval(2000).pipe(
      switchMap(() => this.extractorService.getStatus(fileId)),
      filter(status => status.status === 'completed'),
      take(1)
    );
  }
}

Python ERP Integration

Batch processor for Python-based ERP systems:

import requests
import time

class ErpDocumentProcessor:
    def __init__(self, erp_db):
        self.base_url = "http://hub.api.cefero.com"
        # Required authentication headers
        self.headers = {
            'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
            'X-ClientId': 'Your Company Name'
        }
        self.db = erp_db

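    def upload_file(self, file_path):
        """Upload a file to the API"""
        with open(file_path, 'rb') as f:
            files = {'UploadFiles': f}
            # Timeout values are illustrative; tune them to your document sizes
            response = requests.post(f'{self.base_url}/v1/files',
                headers=self.headers, files=files, timeout=300)
        # Raise on 4xx/5xx (e.g. 401, 413) instead of parsing an error body
        response.raise_for_status()
        return response.json()

    def get_status(self, file_id):
        """Check file processing status"""
        response = requests.get(f'{self.base_url}/v1/files/{file_id}',
            headers=self.headers, timeout=30)
        # 404 (unknown ID) and 500 (processing failed) surface as HTTPError
        response.raise_for_status()
        return response.json()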

    def process_batch(self, document_paths):
        """Process a batch of documents and post to ERP"""
        for path in document_paths:
            # Upload to API
            response = self.upload_file(path)
            file_id = response['fileId']

            # Wait for processing
            result = self.wait_for_completion(file_id)

            # Map and validate
            erp_data = self.map_to_erp_format(result)
            if self.validate_business_rules(erp_data):
                self.db.create_invoice(erp_data)
            else:
                self.db.queue_for_review(erp_data)

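    def wait_for_completion(self, file_id, max_wait=300):
        """Poll until file processing completes"""
        elapsed = 0
        while elapsed < max_wait:
            status = self.get_status(file_id)
            if status['status'] == 'completed':
                return status
            # Stop early on failure instead of polling until the timeout
            if status['status'] == 'failed':
                raise RuntimeError(f"Processing failed for file {file_id}")
            time.sleep(2)
            elapsed += 2
        raise TimeoutError(f"Processing timeout after {max_wait}s")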

    def map_to_erp_format(self, api_result):
        """Map API response to ERP data model"""
        return {
            'vendor_id': self.lookup_vendor(api_result),
            'gl_account': self.determine_gl_account(api_result),
            'amount': api_result['results']['total_amount'],
            'line_items': self.parse_line_items(api_result)
        }

Best Practices for ERP Integration

Design Recommendations
  • Implement idempotent operations to handle retries safely
  • Store file IDs in your ERP database for audit trails
  • Use transaction boundaries when posting to your ERP
  • Implement error queues for failed documents
  • Log all API interactions for troubleshooting
  • Configure appropriate timeout values based on document size
  • Consider implementing a manual review queue for low-confidence extractions
  • Use environment-specific API keys (dev, staging, production)
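
One way to make uploads idempotent, as the first recommendation suggests, is to key each submission by a hash of the file content so that a retry reuses the existing file ID. The class below is an in-memory sketch with hypothetical names; in a real integration the mapping would live in an ERP database table.

```python
import hashlib


class UploadLedger:
    """In-memory stand-in for a table that maps content hashes to file IDs."""

    def __init__(self):
        self._by_hash = {}

    def submit(self, content, upload):
        """Upload content once; repeated calls with the same bytes reuse the ID."""
        key = hashlib.sha256(content).hexdigest()
        if key not in self._by_hash:
            # First time this content is seen: perform the real upload
            self._by_hash[key] = upload(content)
        return self._by_hash[key]
```

With this in place, a worker that crashes after uploading but before recording success can simply run again without creating a duplicate submission.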

Security Considerations

  • Store API credentials in secure configuration management systems (Azure Key Vault, AWS Secrets Manager, etc.)
  • Use separate API keys for different environments
  • Implement role-based access control for document processing functions
  • Encrypt sensitive data at rest and in transit
  • Maintain audit logs of all document processing activities
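
As a minimal sketch of the first two points, the request headers can be built from values injected at deploy time instead of being hardcoded. The variable names `CEFERO_API_TOKEN` and `CEFERO_CLIENT_ID` and the function `load_api_headers` are assumptions chosen for illustration. A secret manager would typically populate these variables per environment.

```python
import os

def load_api_headers(env=os.environ):
    """Build the two required headers from environment variables.

    CEFERO_API_TOKEN and CEFERO_CLIENT_ID are hypothetical variable names.
    """
    token = env.get("CEFERO_API_TOKEN")
    client_id = env.get("CEFERO_CLIENT_ID")
    if not token or not client_id:
        # Fail fast at startup instead of sending unauthenticated requests
        raise RuntimeError(
            "Set CEFERO_API_TOKEN and CEFERO_CLIENT_ID before starting"
        )
    return {"Authorization": token, "X-ClientId": client_id}
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real process environment.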

Performance Optimization

  • Process documents asynchronously using background workers or queues
  • Implement connection pooling for HTTP clients
  • Use exponential backoff for status polling
  • Cache vendor lookups and GL account mappings
  • Monitor API usage and implement rate limiting on your side
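
Exponential backoff for status polling might look like the following sketch. The function `poll_with_backoff` and its parameters are hypothetical. It accepts any `get_status` callable that returns a dictionary with a `status` key, and it treats only `completed` as a terminal state, matching the polling example earlier in this guide.

```python
import time

def poll_with_backoff(get_status, file_id, max_wait=300.0,
                      initial_delay=2.0, max_delay=30.0, sleep=time.sleep):
    """Poll get_status(file_id), doubling the delay after each attempt."""
    delay = initial_delay
    waited = 0.0  # total time spent sleeping so far
    while True:
        status = get_status(file_id)
        if status["status"] == "completed":
            return status
        if waited >= max_wait:
            raise TimeoutError(f"Processing timeout after {max_wait}s")
        # Never sleep past the overall budget
        pause = min(delay, max_wait - waited)
        sleep(pause)
        waited += pause
        # Double the delay, capped so long jobs are still checked regularly
        delay = min(delay * 2, max_delay)
```

With the defaults, the waits between polls are 2, 4, 8, 16, and then 30 seconds, which keeps short jobs responsive while reducing request volume for long ones.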

Code Examples

Quick-start code snippets for integrating the API. Each example demonstrates file upload and status checking with minimal code.

.NET Developers - Use the SDK

The examples below show manual HTTP implementations for various languages. If you're using .NET, we strongly recommend using the Cefero.ApiSdk NuGet package instead, which provides a much simpler and type-safe API.

.NET (C#) - Using SDK (Recommended)

Simplest approach for .NET applications:

// Install: dotnet add package Cefero.ApiSdk
using Cefero.ApiSdk;

var client = new CeferoClient(
    "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw",
    "Your Company Name",
    "http://hub.api.cefero.com"
);

// Upload file
var uploadResult = await client.UploadFileAsync("invoice.pdf");
Console.WriteLine($"File ID: {uploadResult.FileId}");

// Get status
var statusResult = await client.GetFileStatusAsync(uploadResult.FileId);
var file = statusResult.Data[0];
Console.WriteLine($"Status: {file.Status}");

// Get OCR results when completed
if (file.Result?.OcrResult != null)
{
    foreach (var field in file.Result.OcrResult.Header.Fields)
        Console.WriteLine($"{field.Key}: {field.Value}");
}

.NET (C#) - Manual HTTP Implementation

If you prefer not to use the SDK, here's the manual HTTP approach:

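// JObject comes from the Newtonsoft.Json package
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

public class DocumentExtractorClient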
{
    private readonly HttpClient _http;
    private const string BaseUrl = "http://hub.api.cefero.com";

    public DocumentExtractorClient()
    {
        _http = new HttpClient();
        _http.DefaultRequestHeaders.Add("Authorization",
            "cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw");
        _http.DefaultRequestHeaders.Add("X-ClientId", "Your Company Name");
    }

    public async Task<string> UploadAsync(string filePath)
    {
        using var content = new MultipartFormDataContent();
        using var file = File.OpenRead(filePath);
        content.Add(new StreamContent(file), "UploadFiles", Path.GetFileName(filePath));

        var response = await _http.PostAsync($"{BaseUrl}/v1/files", content);
        var json = await response.Content.ReadAsStringAsync();
        return JObject.Parse(json)["fileId"].ToString();
    }

    public async Task<JObject> GetStatusAsync(string fileId)
    {
        var response = await _http.GetAsync($"{BaseUrl}/v1/files/{fileId}");
        return JObject.Parse(await response.Content.ReadAsStringAsync());
    }
}

// Usage
var client = new DocumentExtractorClient();
var fileId = await client.UploadAsync("invoice.pdf");
var status = await client.GetStatusAsync(fileId);

Angular (TypeScript)

Service with RxJS for reactive programming:

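import { Injectable } from '@angular/core';
import { HttpClient, HttpHeaders } from '@angular/common/http';
import { Observable } from 'rxjs';

@Injectable({ providedIn: 'root' })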
export class DocumentExtractorService {
  private baseUrl = 'http://hub.api.cefero.com';
  private headers = new HttpHeaders({
    'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
    'X-ClientId': 'Your Company Name'
  });

  constructor(private http: HttpClient) {}

  uploadFile(file: File): Observable<any> {
    const formData = new FormData();
    formData.append('UploadFiles', file);
    return this.http.post<any>(`${this.baseUrl}/v1/files`, formData,
      { headers: this.headers });
  }

  getStatus(fileId: string): Observable<any> {
    return this.http.get<any>(`${this.baseUrl}/v1/files/${fileId}`,
      { headers: this.headers });
  }
}

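// Component usage. switchMap comes from 'rxjs/operators'; pollStatus is a
// component method you supply (not shown) that polls getStatus until done.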
this.service.uploadFile(file).pipe(
  switchMap(res => this.pollStatus(res.fileId))
).subscribe(result => console.log(result));

Python

A minimal implementation using the requests library:

import requests

class DocumentExtractorClient:
    def __init__(self):
        self.base_url = "http://hub.api.cefero.com"
        self.headers = {
            'Authorization': 'cef_prod_cd33lh9m4ln6lgrdb3o1tieeu2t8p619yfik6kqw',
            'X-ClientId': 'Your Company Name'
        }

    def upload_file(self, file_path):
        with open(file_path, 'rb') as f:
            files = {'UploadFiles': f}
            response = requests.post(f'{self.base_url}/v1/files',
                headers=self.headers, files=files)
            return response.json()

    def get_status(self, file_id):
        response = requests.get(f'{self.base_url}/v1/files/{file_id}',
            headers=self.headers)
        return response.json()

# Usage
client = DocumentExtractorClient()
result = client.upload_file("invoice.pdf")
status = client.get_status(result['fileId'])

Error Responses

Authentication Error (401)

{
  "error": "Unauthorized",
  "message": "Invalid or missing authentication token",
  "statusCode": 401
}

Validation Error (400)

{
  "error": "Bad Request",
  "message": "File is required",
  "statusCode": 400
}

Processing Error (500)

{
  "error": "Internal Server Error",
  "message": "File processing failed",
  "fileId": "a3b8c9d0-1234-5678-90ab-cdef12345678",
  "statusCode": 500
}
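
All three error bodies share the `error`, `message`, and `statusCode` fields, so a client can convert them into a single exception type. The sketch below is one way to do that. The `ApiError` class and the `raise_for_api_error` function are hypothetical helpers, not part of the API or an SDK.

```python
import json

class ApiError(Exception):
    """Hypothetical exception carrying the fields of an error body."""

    def __init__(self, status_code, error, message, file_id=None):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.error = error
        self.message = message
        self.file_id = file_id  # present only on processing errors

def raise_for_api_error(status_code, body_text):
    """Return the parsed body for success; raise ApiError for 4xx/5xx."""
    try:
        body = json.loads(body_text)
    except ValueError:
        body = None  # non-JSON body, e.g. from a proxy
    if status_code < 400:
        return body
    fields = body if isinstance(body, dict) else {}
    raise ApiError(
        status_code=fields.get("statusCode", status_code),
        error=fields.get("error", "Unknown Error"),
        message=fields.get("message", body_text),
        file_id=fields.get("fileId"),
    )
```

With the requests library, this could be called as `raise_for_api_error(response.status_code, response.text)` before reading any result fields.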

Best Practices

  • Status Polling: Check file status every 2-3 seconds. Increase intervals for long-running processes.
  • File Size: Split files larger than 100MB for better performance.
  • Error Handling: Implement retry logic for network failures.
  • Timeouts: Set upload timeouts based on file size (allow 1 minute per 10MB).
  • Rate Limits: Respect API limits to maintain service quality.
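
The timeout and retry guidance above can be sketched as two small helpers. Both `upload_timeout_seconds` and `with_retries` are hypothetical names, and treating 10MB as 10 × 1024 × 1024 bytes is an assumption.

```python
import math
import time

def upload_timeout_seconds(file_size_bytes, seconds_per_10mb=60, minimum=60):
    """Allow one minute per started 10 MB block, with a one-minute floor."""
    ten_mb = 10 * 1024 * 1024  # assumption: MB means 2**20 bytes
    blocks = math.ceil(file_size_bytes / ten_mb)
    return max(minimum, blocks * seconds_per_10mb)

def with_retries(operation, attempts=3, base_delay=1.0,
                 retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying on the given errors with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

When using the requests library, pass the computed value as the `timeout` argument of `requests.post`, and set `retry_on` to `(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`, since those classes do not derive from the built-in `ConnectionError` and `TimeoutError`.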

Security

Store API keys in environment variables or secure vaults. Never commit keys to version control.