Compute in Oracle Cloud Infrastructure

  • OCI Compute lets you provision & manage “instances” on-demand.
  • Instances are of 2 types:
    • Bare Metal (BM):
      • Dedicated physical server.
      • Highest performance & isolation.
    • Virtual Machine (VM):
      • Multiple isolated VMs share a bare metal server through a hypervisor.
      • Run on the same hardware as BM instances.
  • OCI uses Ksplice to patch the hypervisor kernel without a reboot, so VMs don’t need to be paused. All hypervisors are supported.

Instance Shapes in OCI Compute

A “shape” is a prepackaged combination of CPU, memory & network resources.

Standard Shapes

  • For general purpose workloads.
  • Provide a balance of CPU cores, memory & network resources.
  • Come with Intel or AMD processors.

Dense IO Shapes

  • For large databases, big data workloads & apps that need high performance local storage.
  • Include locally-attached NVMe-based SSDs.

GPU Shapes

  • For hardware accelerated workloads.
  • Include Intel CPUs & NVIDIA GPUs.

HPC Shapes

  • High performance computing.
  • For workloads that need high frequency processor cores / cluster networking.
  • Available only for BM instances.

Flexible Shapes

  • Customize allocated OCPUs.
  • Memory, network bandwidth & VNICs scale proportionately with OCPUs.

Components Required to Launch an Instance

  • AD (availability domain): Data center where instance resides.
  • VCN + Subnet
  • Tags (optional).
  • SSH key pair for Linux.
  • Password for Windows:
    • Auto-generated by OCI on instance creation.
    • Must be changed on first login.
  • Image:
    • Template of VHD — Virtual Hard Drive
    • Determines OS & preinstalled software.
    • Can be 1 of these:
      • Oracle-provided images.
      • Trusted 3rd-party images.
      • Prebuilt Oracle enterprise images.
      • Custom images.
      • BYOI — Bring Your Own Image
      • Boot volumes.
  • Shape: Determines CPU, memory, etc.
  • Block volumes (optional).
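The checklist above can be turned into a small pre-flight check. This is a minimal sketch, assuming a hypothetical `LaunchSpec` record and `missing_components` helper. Neither is part of any OCI SDK, and the field names are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LaunchSpec:
    """Hypothetical record of the launch inputs (not an OCI SDK type)."""
    availability_domain: Optional[str] = None
    subnet_id: Optional[str] = None           # subnet inside a VCN
    image_id: Optional[str] = None
    shape: Optional[str] = None
    os_family: str = "linux"                  # "linux" or "windows"
    ssh_public_key: Optional[str] = None      # needed for Linux only
    tags: Dict[str, str] = field(default_factory=dict)          # optional
    block_volume_ids: List[str] = field(default_factory=list)   # optional


def missing_components(spec: LaunchSpec) -> List[str]:
    """Return the names of required components that are not set."""
    required = ["availability_domain", "subnet_id", "image_id", "shape"]
    # Windows gets an auto-generated password, so only Linux needs a key.
    if spec.os_family == "linux":
        required.append("ssh_public_key")
    return [name for name in required if not getattr(spec, name)]
```

An empty list from `missing_components` means every mandatory item is present. Tags and block volumes are never reported because they are optional.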

Storage for Compute Instances

  • Block Volume:
    • Dynamically provision “blocks” of storage.
    • A volume can be attached to 1 or more compute instances.
  • File Storage:
    • NFS — Network File System
    • Durable, scalable, secure, enterprise-grade.
    • Connect to it from 1 or more instances.
  • Object Storage:
    • Internet-scale, high-performance.
    • Unlimited capacity.
    • For unstructured data of any content type.
    • This is regional storage — not tied to a compute instance.
  • Archive Storage:
    • Same as object storage but for data that doesn’t require instantaneous retrieval.

Best Practices for Compute Instances

Oracle-Reserved IPs

  • 169.254.0.0/16 is used for:
    • iSCSI connections to boot & block volumes.
    • Instance metadata.
    • Etc.
  • Class D IPs:
    • 224.0.0.0 — 239.255.255.255
    • Reserved for multicast address assignments.
  • Class E IPs:
    • 240.0.0.0 — 255.255.255.255
    • Reserved for future use.
  • 3 IPs in Each Subnet:
    • 1st IP in CIDR — Network Address
    • Last IP in CIDR — Broadcast Address
    • 1st Host Address — Default Gateway
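The reserved addresses listed above can be computed with Python's standard `ipaddress` module. This is a minimal sketch. The helper names `subnet_reserved_ips` and `is_oracle_reserved` are made up for illustration, and the Class D and Class E ranges are written as their equivalent /4 CIDR blocks.

```python
import ipaddress


def subnet_reserved_ips(cidr: str) -> dict:
    """Return the three addresses reserved in every subnet."""
    net = ipaddress.ip_network(cidr)
    return {
        "network": str(net.network_address),              # 1st IP in the CIDR
        "default_gateway": str(net.network_address + 1),  # 1st host address
        "broadcast": str(net.broadcast_address),          # last IP in the CIDR
    }


RESERVED_RANGES = [
    ipaddress.ip_network("169.254.0.0/16"),  # iSCSI, instance metadata, etc.
    ipaddress.ip_network("224.0.0.0/4"),     # Class D: 224.0.0.0 to 239.255.255.255
    ipaddress.ip_network("240.0.0.0/4"),     # Class E: 240.0.0.0 to 255.255.255.255
]


def is_oracle_reserved(ip: str) -> bool:
    """True if the address falls in one of the reserved ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RESERVED_RANGES)
```

For example, `subnet_reserved_ips("10.0.0.0/24")` reports 10.0.0.0 as the network address, 10.0.0.1 as the default gateway and 10.0.0.255 as the broadcast address.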

Firewall Rules

  • Oracle-provided images have preconfigured firewall rules.
  • These allow only root on Linux & Administrators on Windows to connect to iSCSI endpoints — 169.254.0.2:3260 & 169.254.2.0/24:3260.
  • Don’t remove these rules. If you do, non-admins can access boot disk.
  • Don’t create custom images without these rules.
  • Don’t enable Uncomplicated Firewall (UFW) on Ubuntu.

Resilience Best Practices

  • Keep redundant instances in different ADs.
  • Create custom image after every system modification.
  • Back up regularly.
  • If hardware fails, launch instance from custom image & apply backups.

Never Lose Access to Instance

  • A DHCP client runs on every instance.
  • It leases its IP from DHCP server.
  • It renews its lease every 24 hours.
  • Never stop the DHCP client. If you do, IP lease won’t renew & you lose access.
  • Disabling NetworkManager also stops DHCP client.
  • Stopping DHCP client might remove host route table when lease expires.
  • Loss of iSCSI connectivity might result in loss of boot drive.

Misc Best Practices

  • Default user is opc on Oracle Linux, CentOS & Windows images; on Ubuntu images it is ubuntu.
  • Don’t share SSH keys. Create more SSH-enabled users.
  • Use Oracle-provided NTP server.
  • Each AD has 3 FDs (fault domains).
  • If your app has 2 web servers & 2 DB servers, place 1 web server & 1 DB server in one FD & the other pair in a different FD.
  • If your app has 1 web server & 1 DB server, put both in the same FD.
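The fault domain placement rule above can be sketched as a small function. This is an illustration only: `place_by_tier` is a hypothetical helper, and the instance names are invented. It puts the i-th instance of every tier in the i-th FD, so instances that work together share an FD while redundant copies are spread out.

```python
from typing import Dict, List


def place_by_tier(tiers: Dict[str, List[str]], fault_domains: List[str]) -> Dict[str, str]:
    """Assign the i-th instance of every tier to the i-th fault domain."""
    placement = {}
    for instances in tiers.values():
        for i, name in enumerate(instances):
            # Wrap around if a tier has more instances than there are FDs.
            placement[name] = fault_domains[i % len(fault_domains)]
    return placement


fds = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]
two_by_two = place_by_tier({"web": ["web-1", "web-2"], "db": ["db-1", "db-2"]}, fds)
single = place_by_tier({"web": ["web-1"], "db": ["db-1"]}, fds)
```

Here `two_by_two` pairs web-1 with db-1 in the first FD and web-2 with db-2 in the second, while `single` keeps the lone web server and DB server together in the first FD.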

Customer-Managed VM Maintenance

  • When the hardware under your VM needs maintenance, Oracle notifies you.
  • During the maintenance window, your VM is migrated to other hardware.
  • You can reboot the VM before the window to migrate it at a time of your choosing.

Protecting Data on NVMe Devices

  • Some shapes include locally-attached NVMe devices.
  • These provide low latency, high performance block storage.
  • Ideal for big data, OLTP, etc.
  • These are not protected in any way: no backups, no images, no RAID. Protecting the data is up to you.
  • To find these, run lsblk & look for nvme.
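The lsblk check above can be scripted. This is a minimal sketch, assuming a Linux host with `lsblk` installed. The function names are made up for illustration. The parser is kept separate from the subprocess call so it can be tried on sample output.

```python
import subprocess
from typing import List


def nvme_devices(lsblk_output: str) -> List[str]:
    """Pick NVMe disk paths out of `lsblk -d -n -o NAME,TYPE` output."""
    devices = []
    for line in lsblk_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("nvme") and parts[1] == "disk":
            devices.append("/dev/" + parts[0])
    return devices


def list_local_nvme() -> List[str]:
    """Run lsblk on the current host and return its NVMe disks."""
    out = subprocess.run(
        # -d: whole disks only, -n: no header row, -o: columns to print
        ["lsblk", "-d", "-n", "-o", "NAME,TYPE"],
        capture_output=True, text=True, check=True,
    ).stdout
    return nvme_devices(out)
```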

Protect Against Device Failure

  • Use RAID array:
    • RAID 1: Exact copy of data on 2 or more disks.
    • RAID 10: Stripes data across multiple mirrored pairs.
    • RAID 6: Block-level striping with 2 parity blocks distributed across all member disks.
  • To get notified of device failure, set MAILADDR in /etc/mdadm.conf & run mdadm --monitor as a daemon.
  • To simulate device failure, run sudo mdadm /dev/md0 --fail /dev/nvme0n1.
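To compare the RAID levels above, it helps to work out how much usable space each one leaves. This is a minimal sketch, assuming identical disks, ignoring metadata overhead, and treating RAID 10 strictly as mirrored pairs. The function names are made up for illustration.

```python
def usable_capacity_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable capacity of an array built from identical disks."""
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_tb                   # every disk holds the same copy
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 as mirrored pairs needs an even number of disks, 4 or more")
        return disks // 2 * disk_tb      # half the disks hold mirror copies
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_tb     # two disks' worth of parity
    raise ValueError(f"unsupported level: {level}")


def failures_always_survived(level: str, disks: int) -> int:
    """Number of disk failures the array is guaranteed to survive."""
    return {"raid1": disks - 1, "raid10": 1, "raid6": 2}[level]
```

RAID 10 can survive more than one failure when the failed disks sit in different mirrored pairs, but only one failure is guaranteed.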

Protect Against Loss of Instance or AD

  • Replication:
    • Replicate data to another instance in another AD.
    • Lowest RTO & RPO. Highest cost.
    • For Oracle DBs, use built-in Data Guard.
    • Replication can be sync or async.
    • Use DRBD for general purpose block replication.
  • Backups:
    • RTO & RPO significantly higher.
    • Costs significantly lower.
    • Don’t store backups in same AD.

Protect Against Data Corruption/Loss from App/User Error

  • Snapshots:
    • Either use file system that supports snapshots, like ZFS.
    • Or use LVM to manage snapshots.
    • Performance may significantly degrade when taking LVM snapshot.
  • Backups:
    • Don’t store backups in same AD.

Boot Volumes

  • Auto-created at instance launch in same compartment.
  • Associated with instance until instance is terminated.
  • Can optionally preserve BV when terminating instance.
  • BV of 1 instance can be used to launch another instance of different shape/type.
    • This way you can switch from BM to VM & vice-versa.
    • Or scale up/down an instance.
  • To repair BV, stop instance, detach BV, attach BV to another instance as data volume, repair it & reattach back to original instance.
  • Encrypted at-rest by default.
  • In-transit encryption for boot/block volumes is available only for VMs launched from Oracle-provided images.
  • Boot & block volumes can be grouped into a volume group.
    • This makes it possible to backup/clone entire instance (system + storage disks) in 1 go.
  • When launching instance, provide BV size:
    • 50 GB (default) to 32 TB for Linux.
    • 256 GB (default) to 32 TB for Windows.
    • If non-default size is provided, extend the partition on the volume.
  • Cannot resize boot volume after launching instance.
  • BV performance can be switched between “balanced” & “high performance” any time.
  • Backups:
    • Block volume service’s backups feature takes crash-consistent backups — point-in-time snapshot w/o app interruption/downtime.
    • Can backup BV that’s attached or detached.
    • Backups can be:
      • Manual or scheduled by policy.
      • Full or incremental.
    • Volume’s tags are applied to volume’s backups.
    • Backup’s tags are applied to volume restored from it.
    • Backup might be larger than source volume because:
      • Many OSes zero-out entire volume. This marks those blocks as initialized, so backup includes them.
      • Backup includes up to 1 GB of metadata.
    • Backups can be copied across regions for BC/DR, migration/expansion.
    • To copy cross-region, you need permissions to read & copy boot volume backups in source region & create boot volume backups in destination region.
    • Limits of cross-region copy:
      • You can only copy 1 backup at a time from a source region.
      • You can only copy boot volume backups for instances based on Oracle-provided images.
      • The shape compatibility list is from the source region & cannot be changed.
      • When you create an instance from the Console & specify a boot volume backup that was copied from another region as the image source, you may get a message indicating that there was an error loading the source image. Ignore the error & continue!
  • Clones:
    • Copy a boot volume without going through backup & restore process.
    • Clone is a point-in-time direct disk-to-disk deep copy of the source boot volume.
    • Any subsequent changes to the data on the source volume are not copied to the clone.
    • Clone is same size as source volume but you can specify larger size when cloning if required.
    • Clone happens immediately. You can use clone right away.
    • You can only create 1 clone at a time from a volume.
      • No backups should be running either.
    • Clone must be in same region, AD & tenancy but can be in different compartment.
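The size rules for boot volumes and clones above can be checked before launch. This is a minimal sketch with hypothetical helper names, assuming 1 TB is counted as 1024 GB.

```python
TB_IN_GB = 1024  # assumption: binary terabytes, so 32 TB = 32768 GB

# Minimum (default) and maximum boot volume sizes in GB, per OS family.
BOOT_VOLUME_LIMITS_GB = {
    "linux": (50, 32 * TB_IN_GB),
    "windows": (256, 32 * TB_IN_GB),
}


def check_boot_volume_size(os_family: str, size_gb: int) -> bool:
    """Return True when the partition must be extended after launch.

    Raises ValueError if size_gb is outside the allowed range.
    """
    minimum, maximum = BOOT_VOLUME_LIMITS_GB[os_family]
    if not minimum <= size_gb <= maximum:
        raise ValueError(
            f"{os_family} boot volume must be {minimum}-{maximum} GB, got {size_gb}"
        )
    return size_gb != minimum  # non-default size: extend the partition


def check_clone_size(source_gb: int, clone_gb: int) -> None:
    """A clone may match its source size or be larger, never smaller."""
    if clone_gb < source_gb:
        raise ValueError("clone cannot be smaller than its source volume")
```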

Volume Backup vs Clone

                 | BACKUP | CLONE
-----------------|--------|------
Description      | Point-in-time backup of data. Can restore multiple volumes from a backup. | Single point-in-time copy of volume. Same as backup + restore.
Use case         | Backup data to duplicate environment or preserve data. Meet compliance & regulatory requirements. Support DR/BC. | Rapidly duplicate an env e.g. prod to dev.
Speed            | Minutes / Hours | Seconds
Cost             | Low | High
Storage location | Object Storage | Block Volume
Retention policy | Policy-based backups expire. Manual backups don’t expire. | No expiry.
Volume groups    | Supported | Supported