Class for running a large language model on top of onnxruntime.

Constructors

Properties

eos: bigint = 2n
feed: Record<string, Tensor> = {}
outputTokens: bigint[] = []
sess?: InferenceSession

Methods

  • Generate tokens using greedy search

    Parameters

    • tokens: bigint[]

      Initial tokens

    • callback: ((tokens: bigint[]) => void)

      Callback function to handle the generated tokens

        • (tokens): void

          Parameters

          • tokens: bigint[]

          Returns void

    • options: {
          maxTokens: number;
      }

      Generation options

      • maxTokens: number

    Returns Promise<bigint[]>

    Array of generated tokens
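
    The greedy search described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: `nextLogits`, `argmax`, and `greedyGenerate` are hypothetical names, and `nextLogits` stands in for whatever forward pass the class runs through its `InferenceSession`. The sketch also assumes that `maxTokens` counts newly generated tokens, that generation stops at `eos` without emitting it, and that the callback receives the tokens generated so far; the reference above does not confirm any of these details.

    ```typescript
    // Hypothetical stand-in for one forward pass of the model: given the tokens
    // so far, return one logit per vocabulary entry for the next position.
    type NextLogits = (tokens: bigint[]) => Promise<Float32Array>;

    // Index of the largest logit; ties resolve to the lowest index.
    function argmax(logits: Float32Array): number {
      let best = 0;
      for (let i = 1; i < logits.length; i++) {
        if (logits[i] > logits[best]) best = i;
      }
      return best;
    }

    // Greedy search: at every step, append the single most likely next token.
    async function greedyGenerate(
      nextLogits: NextLogits,
      tokens: bigint[],
      callback: (tokens: bigint[]) => void,
      options: { maxTokens: number },
      eos: bigint = 2n, // same default as the `eos` property
    ): Promise<bigint[]> {
      const context = [...tokens]; // prompt plus everything generated so far
      const generated: bigint[] = [];
      while (generated.length < options.maxTokens) {
        const next = BigInt(argmax(await nextLogits(context)));
        if (next === eos) break; // assumption: eos ends generation and is not emitted
        context.push(next);
        generated.push(next);
        callback(generated); // assumption: callback sees the tokens generated so far
      }
      return generated;
    }
    ```

    Because greedy search always takes the highest-scoring token, the output is deterministic for a given model and prompt.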

  • Parameters

    • model: string
    • onnx_file: string = "onnx/model.onnx"
    • options: LoadOptions

    Returns Promise<void>