notes 20220201 en mid.txt

Sanjay Aiyagari edited this page Feb 3, 2022 · 1 revision

Secure AI Fabric - 2/1/2022

Remote execution use case

  • Description
    • Scan local YAML file
    • Notice that it could run faster on some other hardware in another cluster
    • Set it up for the other location
  • This use case could be done with an operator that describes:
    • Memory binding
    • Data transfer
    • Price of data transfer and storage
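A minimal sketch of the placement decision described above, assuming the local YAML file has already been parsed into a dict. The field names (`local_runtime_s`, `data_gb`), the `Cluster` attributes, and the cost model are all hypothetical and were not specified in the meeting; the sketch only illustrates weighing faster hardware against the time and price of moving data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a candidate execution location."""
    name: str
    speedup: float                  # relative to local hardware (1.0 = same speed)
    is_local: bool                  # True if the data already lives here
    transfer_gb_per_s: float = 1.0  # assumed bandwidth to this cluster
    price_per_gb: float = 0.0       # assumed price of data transfer and storage


def estimate(job: dict, cluster: Cluster) -> tuple[float, float]:
    """Return (total_seconds, data_cost) for running `job` on `cluster`."""
    compute_s = job["local_runtime_s"] / cluster.speedup
    if cluster.is_local:
        return compute_s, 0.0
    transfer_s = job["data_gb"] / cluster.transfer_gb_per_s
    return compute_s + transfer_s, job["data_gb"] * cluster.price_per_gb


def choose_cluster(job: dict, clusters: list[Cluster]):
    """Pick the cluster with the lowest estimated total time."""
    best = min(clusters, key=lambda c: estimate(job, c)[0])
    seconds, cost = estimate(job, best)
    return best, seconds, cost


# Example: the job dict stands in for the parsed local YAML file.
local = Cluster("local", speedup=1.0, is_local=True)
remote = Cluster("remote-gpu", speedup=4.0, is_local=False,
                 transfer_gb_per_s=1.0, price_per_gb=0.25)
job = {"local_runtime_s": 3600.0, "data_gb": 100.0}
best, seconds, cost = choose_cluster(job, [local, remote])
print(best.name, seconds, cost)
```

A real operator would also have to account for memory binding and for policies that forbid moving the data at all; this sketch ignores both.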

Catalog discussion

  • Central intelligence group will build the catalog for self-describing assets
  • Same type of catalog should be used to represent inventory of hardware to execute models

Resource management

  • Can we use a Slurm interface?
  • Similar to Kubernetes
    • Manages disk, RAM, CPU, plus other custom resources
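As a rough illustration of the resource model discussed above (disk, RAM, CPU, plus custom resources), the sketch below checks whether a request fits on a node and subtracts it from the node's free capacity. It is plain Python, not the Slurm or Kubernetes API; the resource names, including the custom `example.com/gpu` key, are hypothetical.

```python
def fits(request: dict, available: dict) -> bool:
    """True if every requested resource is available in sufficient quantity.

    A resource missing from `available` is treated as having zero capacity.
    """
    return all(available.get(name, 0) >= amount
               for name, amount in request.items())


def allocate(request: dict, available: dict) -> dict:
    """Return the remaining capacity after granting `request`."""
    if not fits(request, available):
        raise ValueError("request does not fit on this node")
    return {name: capacity - request.get(name, 0)
            for name, capacity in available.items()}


# Hypothetical node inventory with one custom resource.
node = {"cpu": 16, "ram_gb": 64, "disk_gb": 500, "example.com/gpu": 2}
request = {"cpu": 4, "ram_gb": 16, "example.com/gpu": 1}

remaining = allocate(request, node)
print(remaining)
```

Treating custom resources as ordinary named quantities keeps the check uniform, which is the property that would let one interface cover both built-in and site-specific hardware.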

Remote access discussion

  • Abstraction of data access using network
  • Discussed whether the data is loaded fully or accessed remotely
  • The answer will differ by industry
    • Financial services / Telco: OK to have data in-flight, just can't store on-disk
    • Other industries: maybe OK to have data lakes, or maybe not OK to move data at all
  • We still need to discuss the difference between applications that "use the network" and those that go "directly to the local disk", as some currently do
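One way to picture the industry differences above is as a policy that selects a data access mode. The sketch below is an assumption-laden illustration: the `DataPolicy` flags, the `AccessMode` names, and the reading that "loaded fully" implies an on-disk copy are all hypothetical and were not agreed in the meeting.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    LOCAL_COPY = "local_copy"        # load the data fully onto local disk
    REMOTE_STREAM = "remote_stream"  # read over the network, never persist
    IN_PLACE_ONLY = "in_place_only"  # data cannot move; send compute to it


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical per-industry constraints on data movement."""
    allow_in_flight: bool     # may the data cross the network?
    allow_on_disk_copy: bool  # may a copy be stored at the destination?


def choose_access_mode(policy: DataPolicy) -> AccessMode:
    """Map a policy to the least restrictive access mode it permits."""
    if not policy.allow_in_flight:
        return AccessMode.IN_PLACE_ONLY
    if policy.allow_on_disk_copy:
        return AccessMode.LOCAL_COPY
    return AccessMode.REMOTE_STREAM


# Example policies following the cases listed in the notes.
financial_or_telco = DataPolicy(allow_in_flight=True, allow_on_disk_copy=False)
data_lake_ok = DataPolicy(allow_in_flight=True, allow_on_disk_copy=True)
no_movement = DataPolicy(allow_in_flight=False, allow_on_disk_copy=False)

print(choose_access_mode(financial_or_telco))
```

The `IN_PLACE_ONLY` case is where the remote execution use case applies: if the data may not move, the job has to be set up where the data lives.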

Plan for demos

  1. Remote data access - 2/15/2022
  2. Catalog - TBD
  3. AI Research - TBD