Resource Accounting

Key Points

  • Tensors are the core data structure in machine learning, used to store:
      • Parameters
      • Gradients
      • Optimizer states
      • Data and activations

Tensor Memory Types

  • float32 (single precision):
      • 4 bytes per value; the default in PyTorch and the standard for full precision in ML
      • Good balance of range and accuracy

  • float16 (half precision):
      • 2 bytes per value, so half the memory of float32
      • Narrow exponent range: very small numbers underflow to zero

  • bfloat16 (brain floating point):
      • Same 2 bytes per value as float16
      • More bits for the exponent (matching float32's range), fewer for the fraction (less precision)

  • fp8 (8-bit floating point):
      • 1 byte per value; very low memory usage
      • Supported on recent hardware (e.g., NVIDIA H100), typically in E4M3 and E5M2 variants
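
The range trade-off between float16 and bfloat16 can be seen directly in PyTorch. The sketch below (using the standard `torch.finfo` and tensor APIs) shows the bytes per value for each dtype, and how a very small number underflows in float16 but survives in bfloat16:

```python
import torch

# Bytes per value for the common training dtypes
for dtype in [torch.float32, torch.float16, torch.bfloat16]:
    print(dtype, torch.finfo(dtype).bits // 8, "bytes")

# float16's narrow exponent range underflows small values to zero;
# bfloat16 shares float32's exponent range, so the value survives.
tiny_fp16 = torch.tensor(1e-8, dtype=torch.float16)
tiny_bf16 = torch.tensor(1e-8, dtype=torch.bfloat16)
print(tiny_fp16.item())  # 0.0 (underflow)
print(tiny_bf16.item())  # small but nonzero
```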

Memory usage depends on:

  • The number of values in the tensor
  • The size (in bytes) of each value's data type
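
In PyTorch this accounting is just number of values times bytes per value, which `numel()` and `element_size()` give directly. A quick sketch:

```python
import torch

x = torch.zeros(1024, 1024)              # float32 by default
bytes_fp32 = x.numel() * x.element_size()
print(bytes_fp32)                         # 1024 * 1024 * 4 = 4194304 bytes (4 MiB)

x_bf16 = x.to(torch.bfloat16)             # casting to bfloat16 halves the memory
bytes_bf16 = x_bf16.numel() * x_bf16.element_size()
print(bytes_bf16)                         # 2097152 bytes (2 MiB)
```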

Training Implications

  • Training with float32 is stable but memory-intensive
  • Training with float16, bfloat16, or fp8 saves memory but can cause instability
  • Mixed precision training is common (e.g., lower precision for most forward-pass compute, float32 for numerically sensitive parts such as parameters and optimizer state)
  • Tensors are on CPU by default; use GPU for acceleration
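
One common way to get mixed precision in PyTorch is `torch.autocast`, which runs eligible ops in a lower precision while parameters stay in float32. A minimal CPU sketch (the same pattern applies on GPU with `device_type='cuda'`):

```python
import torch

model = torch.nn.Linear(16, 4)   # parameters are stored in float32
x = torch.randn(8, 16)

# Inside autocast, eligible ops (e.g., linear/matmul) run in bfloat16,
# while the float32 "master" weights are left untouched.
with torch.autocast(device_type='cpu', dtype=torch.bfloat16):
    y = model(x)

print(model.weight.dtype)  # torch.float32
print(y.dtype)             # torch.bfloat16
```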

PyTorch Example: Device & Memory

import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Device: {device}")

# Inspect every visible GPU (name, total memory, compute capability, ...)
num_gpus = torch.cuda.device_count()
for i in range(num_gpus):
    properties = torch.cuda.get_device_properties(i)
    print(properties)

# Bytes of GPU memory currently occupied by tensors (None when on CPU)
memory_allocated = torch.cuda.memory_allocated() if device == 'cuda' else None
print(f"Memory allocated: {memory_allocated}")

How Tensors Work in PyTorch

  • A tensor is a pointer into a block of allocated memory
  • Metadata (such as strides) describes how to map each index to a location in that memory
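
This split between storage and metadata is easy to observe: operations like transpose change only the strides, not the underlying memory. A short sketch using the standard `stride()` and `data_ptr()` APIs:

```python
import torch

x = torch.arange(12).reshape(3, 4)
print(x.stride())   # (4, 1): move 4 elements per row step, 1 per column step

# Transposing rewrites only the metadata; no data is copied
xt = x.t()
print(xt.stride())  # (1, 4)

# Both views point at the same underlying storage
print(x.data_ptr() == xt.data_ptr())  # True
```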