- Main Performance Impacts to Games Built on Unity Engine
- How to Make GC Events Not Cause Unity to Hiccup
- Unity Specific Documentation & Resources
Checklist of the primary performance impacts devs encounter in Unity games:
- Scripting Issues:
- Inefficient Code: Placing expensive operations within frequently called methods (Update(), LateUpdate(), FixedUpdate()) or inside tight loops.
- Example (Bad): Repeatedly finding a component.
    void Update()
    {
        // This searches for the Rigidbody every single frame!
        GetComponent<Rigidbody>().AddForce(Vector3.up * Time.deltaTime);
    }
- Example (Good): Caching the component reference once.
    private Rigidbody _rb;

    void Awake()
    {
        _rb = GetComponent<Rigidbody>(); // Get it once
    }

    void Update()
    {
        _rb.AddForce(Vector3.up * Time.deltaTime); // Use the cached reference
    }
- Garbage Collection (GC) Spikes: Caused by frequent and unnecessary managed heap allocations.
- When your code constantly creates new heap objects (arrays, lists, class instances, or strings built by concatenation; a plain struct such as Vector3 does not allocate unless it is boxed), the garbage collector eventually has to pause the main thread to clean up unused memory, leading to noticeable "hiccups" or "stutters."
- Example (Bad): Calling an API that returns a new array every frame.
    void Update()
    {
        // RaycastAll allocates a new RaycastHit[] on every call
        RaycastHit[] hits = Physics.RaycastAll(transform.position, transform.forward, 100f);
        for (int i = 0; i < hits.Length; i++)
        {
            // ...
        }
    }
- Example (Good): Using non-allocating versions or caching.
    private readonly RaycastHit[] _hits = new RaycastHit[16]; // Allocate the buffer once

    void Update()
    {
        // RaycastNonAlloc fills the existing buffer and returns the number of hits
        int count = Physics.RaycastNonAlloc(transform.position, transform.forward, _hits, 100f);
        for (int i = 0; i < count; i++)
        {
            // ...
        }
    }
- Misuse of Coroutines: Poorly managed coroutines (e.g., creating new WaitForSeconds objects every frame) can contribute to allocations and CPU overhead.
- Draw Calls/Batching:
- High Number of Draw Calls: Each time the CPU instructs the GPU to render a set of geometry (a "draw call"), there's overhead involved in preparing and sending that data. Too many individual draw calls can overwhelm the CPU.
- Mobile Recommendation: Aim for under 100-150 draw calls per frame for smooth mobile performance. High-end devices might tolerate up to 200-300, but lower is always better.
- Lack of Effective Batching: Not utilizing Unity's built-in optimization features like static batching (for stationary objects), dynamic batching (for small, moving objects with the same material), or GPU instancing (for many identical meshes with the same material) means the CPU has to prepare more unique draw calls.
- Example: Hundreds of individual tree prefabs without static batching or instancing would result in hundreds of draw calls. If they are marked static and use the same material, Unity can batch them.
- GPU Instancing Example (Manual/Advanced):
- First, ensure your material has "Enable GPU Instancing" checked in its inspector.
- Then, you can manually draw many meshes with one call.
    // Requires a Mesh (e.g., a cube) and a Material with "Enable GPU Instancing" checked
    public Mesh instancedMesh;
    public Material instancedMaterial;
    public int instanceCount = 1000; // DrawMeshInstanced accepts at most 1023 instances per call

    private Matrix4x4[] _matrices;        // Transformation matrix for each instance
    private Vector4[] _instanceColors;    // Per-instance colors
    private MaterialPropertyBlock _props; // Carries the per-instance data to the shader

    void Start()
    {
        _matrices = new Matrix4x4[instanceCount];
        _instanceColors = new Vector4[instanceCount];
        for (int i = 0; i < instanceCount; i++)
        {
            // Set up a random position, rotation and scale for each instance
            Vector3 pos = new Vector3(Random.Range(-50f, 50f), 0f, Random.Range(-50f, 50f));
            Quaternion rot = Quaternion.Euler(0f, Random.Range(0f, 360f), 0f);
            Vector3 scale = Vector3.one * Random.Range(0.5f, 1.5f);
            _matrices[i] = Matrix4x4.TRS(pos, rot, scale);
            // Assign a random color to each instance
            _instanceColors[i] = new Vector4(Random.value, Random.value, Random.value, 1.0f);
        }
        // Per-instance properties go through a MaterialPropertyBlock.
        // The name "_InstanceColor" must match the property declared in the shader.
        _props = new MaterialPropertyBlock();
        _props.SetVectorArray("_InstanceColor", _instanceColors);
    }

    void Update()
    {
        // Draws all meshes with a single draw call (per material/pass)
        Graphics.DrawMeshInstanced(instancedMesh, 0, instancedMaterial, _matrices, instanceCount, _props);
    }
- Note: For many typical scenes, Unity's automatic GPU Instancing works if you use compatible meshes and materials with "Enable GPU Instancing" checked, without needing Graphics.DrawMeshInstanced calls.
- Shader Example for GPU Instancing (Unlit, Per-Instance Color):
- This shader shows how to declare an instanced property (_InstanceColor) and use Unity's instancing macros.
    Shader "Custom/InstancedUnlitColor"
    {
        Properties
        {
            _Color ("Color", Color) = (1,1,1,1) // Main color, shared by all instances
            _InstanceColor ("Instance Color (Instanced)", Color) = (1,1,1,1) // Per-instance color
        }
        SubShader
        {
            Tags { "RenderType"="Opaque" }
            LOD 100

            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag
                #pragma multi_compile_instancing // This line is crucial for instancing!
                #include "UnityCG.cginc"

                struct appdata
                {
                    float4 vertex : POSITION;
                    UNITY_VERTEX_INPUT_INSTANCE_ID // Required for instance ID
                };

                struct v2f
                {
                    float4 vertex : SV_POSITION;
                    fixed4 color : COLOR; // Output color to fragment shader
                    UNITY_VERTEX_INPUT_INSTANCE_ID // Only needed if the fragment shader reads instanced data
                };

                fixed4 _Color; // Main color

                UNITY_INSTANCING_BUFFER_START(Props) // Start of instanced properties buffer
                    UNITY_DEFINE_INSTANCED_PROP(fixed4, _InstanceColor) // Declare per-instance color
                UNITY_INSTANCING_BUFFER_END(Props) // End of buffer

                v2f vert (appdata v)
                {
                    v2f o;
                    UNITY_SETUP_INSTANCE_ID(v); // Setup instance ID
                    UNITY_TRANSFER_INSTANCE_ID(v, o); // Transfer instance ID to fragment (optional if not needed there)
                    o.vertex = UnityObjectToClipPos(v.vertex);
                    // Read the per-instance color; without instancing this reads the material's _InstanceColor
                    fixed4 instanceColor = UNITY_ACCESS_INSTANCED_PROP(Props, _InstanceColor);
                    o.color = _Color * instanceColor; // Multiply main color by instance color
                    return o;
                }

                fixed4 frag (v2f i) : SV_Target
                {
                    UNITY_SETUP_INSTANCE_ID(i); // Setup instance ID in fragment (if needed, e.g., for texture arrays)
                    return i.color;
                }
                ENDCG
            }
        }
    }
- Too Many Unique Materials: Objects with different materials generally cannot be batched together, increasing the number of draw calls. Even slight material variations (e.g., a different _Color property on otherwise identical materials) can break batching.
- Physics Calculations:
- Excessive or Complex Colliders: Using Mesh Colliders on complex geometry, or having a very large number of simple colliders, significantly increases the computational cost of physics simulations.
- Over-reliance on Complex Real-time Simulations: Highly detailed or numerous interacting rigidbodies, especially with continuous collision detection, can quickly become a CPU bottleneck.
- High FixedUpdate Rate for Physics Complexity: If your FixedUpdate (where physics typically runs) is set to a very high frequency for a scene with demanding physics, it can consume a disproportionate amount of CPU time.
- Tip: Adjust the Fixed Timestep (Time.fixedDeltaTime) in Project Settings > Time. A smaller value means more frequent physics updates (more CPU).
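- Example (Sketch): The same value can also be set from script, for instance to lower the physics rate on weaker devices. This is a minimal sketch; the class name PhysicsRateConfig and the 30 Hz default are assumptions chosen for illustration, not recommended values.

```csharp
using UnityEngine;

// Hypothetical helper: sets the physics step rate once at startup.
public class PhysicsRateConfig : MonoBehaviour
{
    // Assumed target rate in physics steps per second (illustrative only).
    [SerializeField] private float physicsHz = 30f;

    void Awake()
    {
        // fixedDeltaTime is the interval between FixedUpdate calls, in seconds.
        // Clamp to at least 1 Hz so a bad inspector value cannot divide by zero.
        Time.fixedDeltaTime = 1f / Mathf.Max(1f, physicsHz);
    }
}
```

Attach it to any object in the first scene. Profile before and after, since a lower rate trades CPU time for less precise collision handling.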
- UI Updates:
- Complex UI Hierarchies: Deeply nested UI elements or those with many components that trigger layout recalculations can be very CPU-intensive.
- Frequent Updates to UI Elements: Constantly changing text, image sizes, or positions can force the UI system to recalculate layouts and mesh data every frame, leading to significant CPU overhead.
- Example (Bad): Updating a score text in Update() without checking if the score changed.
    public Text scoreText;
    private int currentScore;

    void Update()
    {
        // Even if 'currentScore' hasn't changed, this builds a new string every frame,
        // and every real change to the text forces a UI rebuild.
        scoreText.text = "Score: " + currentScore.ToString();
    }
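- Example (Good, sketch): One way to avoid the per-frame work is to remember the last value that was displayed and touch the Text component only when it differs. The class name ScoreLabel and the AddScore method are hypothetical names used for illustration.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Hypothetical component: refreshes the label only when the score changes.
public class ScoreLabel : MonoBehaviour
{
    public Text scoreText;

    private int _currentScore;
    private int _displayedScore = int.MinValue; // Forces the first refresh

    // Assumed entry point for gameplay code to change the score.
    public void AddScore(int amount)
    {
        _currentScore += amount;
    }

    void Update()
    {
        if (_currentScore == _displayedScore)
        {
            return; // Nothing changed: no string allocation, no UI rebuild
        }

        _displayedScore = _currentScore;
        scoreText.text = "Score: " + _currentScore.ToString();
    }
}
```

An event raised from AddScore would remove the per-frame comparison as well; the polling form is shown because it mirrors the bad example most directly.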
- Animation:
- Complex Animation Setups: Many animated characters, especially with intricate rigs, inverse kinematics (IK), or blend shapes, can strain the CPU as it calculates bone transformations and skinning.
- Mobile Recommendations:
- Skinned Animated Meshes: For characters visible on screen, aim for roughly 5,000 - 20,000 triangles depending on detail level and character importance. Use LODs aggressively.
- Bone Count: Keep the bone count for skinned meshes as low as possible. Generally under 60-75 bones per character is a good target for mobile, though simple characters can be much lower. More bones mean more CPU calculation per vertex.
- Memory Management:
- General High Memory Usage: While not a direct CPU bottleneck in the same way as GC, if your game uses excessive memory, the operating system might resort to swapping data to disk, which is a slow, I/O-bound operation and can cause major hitches. On mobile platforms the OS may instead terminate the app.
- Navigation (NavMesh) System:
- Prefer Static Over Dynamic NavMeshes:
- Strategy: Whenever possible, bake your NavMesh in the Unity editor during development (Window > AI > Navigation). This pre-calculates the walkable areas, making pathfinding queries at runtime fast. Dynamic NavMesh generation (e.g., creating a new NavMesh at runtime for a procedurally generated level) is very CPU-intensive and should be avoided or carefully managed.
- Benefit: Eliminates costly runtime mesh generation, leading to smoother performance.
- Efficient Dynamic Obstacles:
- Strategy: For moving obstacles that temporarily block paths (e.g., a moving platform, a door opening), use the NavMesh Obstacle component instead of trying to regenerate the NavMesh. The obstacle component carves a hole in the NavMesh, allowing agents to avoid it without a full rebuild.
- Benefit: Avoids expensive full NavMesh updates.
- Off-Mesh Links:
- Strategy: Use Off-Mesh Links to connect disconnected areas of your NavMesh, or to allow agents to perform specific actions like jumping over a gap, climbing a ladder, or dropping down a ledge. These are pre-defined connections that agents can traverse even if the NavMesh itself doesn't physically connect.
- Benefit: Enables pathfinding across non-walkable areas without complex NavMesh generation or manual scripting for each unique traversal. They are much more efficient than trying to force agents to navigate these areas through dynamic NavMesh changes.
- Setup: Can be set up manually in the editor or generated automatically based on agent settings (Window > AI > Navigation > Bake tab, under Generate OffMeshLinks).
- NavMesh Areas and Costs:
- Strategy: Define different NavMesh Areas (e.g., "Walkable," "Jump," "Mud," "Water") and assign different traversal costs to them. Agents will automatically prefer paths through lower-cost areas.
- Benefit: Allows for more intelligent and realistic pathfinding behavior without complex custom pathing logic.
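- Example (Sketch): Area costs can also be overridden per agent from script. The sketch below assumes an area named "Mud" has been defined in the Navigation window; that name, the cost value, and the class name MudAverseAgent are illustrative assumptions.

```csharp
using UnityEngine;
using UnityEngine.AI;

// Hypothetical component: makes this agent treat one NavMesh area as more expensive.
[RequireComponent(typeof(NavMeshAgent))]
public class MudAverseAgent : MonoBehaviour
{
    [SerializeField] private string areaName = "Mud"; // Assumed area name
    [SerializeField] private float areaCost = 5f;     // Assumed cost multiplier

    void Start()
    {
        NavMeshAgent agent = GetComponent<NavMeshAgent>();

        // GetAreaFromName returns -1 when no area with that name exists.
        int areaIndex = NavMesh.GetAreaFromName(areaName);
        if (areaIndex < 0)
        {
            Debug.LogWarning("NavMesh area not found: " + areaName);
            return;
        }

        // Overrides the cost for this agent only; other agents keep the default.
        agent.SetAreaCost(areaIndex, areaCost);
    }
}
```

With a higher cost the agent still crosses the area when no cheaper route exists; it simply prefers detours that are shorter than the weighted distance.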
- NavMesh Agent Optimization:
- Strategy: Optimize individual NavMeshAgent components. Lower their Obstacle Avoidance Quality if precise movement isn't always critical (e.g., Low Quality or None for distant agents), and avoid calling SetDestination every frame. Increase their Stopping Distance to avoid unnecessary micro-adjustments near targets.
- Benefit: Reduces the CPU load from pathfinding calculations for each agent.
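- Example (Sketch): A common way to reduce per-agent pathfinding cost is to request a new path on a timer, and only when the target has actually moved. This is a minimal sketch; ThrottledChaser, repathInterval, and minTargetMove are hypothetical names and the default values are untuned assumptions.

```csharp
using UnityEngine;
using UnityEngine.AI;

// Hypothetical component: follows a target but re-plans the path sparingly.
[RequireComponent(typeof(NavMeshAgent))]
public class ThrottledChaser : MonoBehaviour
{
    public Transform target;

    [SerializeField] private float repathInterval = 0.5f; // Seconds between path requests
    [SerializeField] private float minTargetMove = 1f;    // Ignore smaller target movements

    private NavMeshAgent _agent;
    private Vector3 _lastTargetPos;
    private float _nextRepathTime;

    void Awake()
    {
        _agent = GetComponent<NavMeshAgent>();
    }

    void Update()
    {
        if (target == null || Time.time < _nextRepathTime)
        {
            return;
        }
        _nextRepathTime = Time.time + repathInterval;

        // Skip the request if a path exists and the target has barely moved.
        float movedSqr = (target.position - _lastTargetPos).sqrMagnitude;
        if (_agent.hasPath && movedSqr < minTargetMove * minTargetMove)
        {
            return;
        }

        _lastTargetPos = target.position;
        _agent.SetDestination(target.position);
    }
}
```

Distant or off-screen agents can use a longer interval than nearby ones, which spreads path requests over time instead of issuing them all in the same frame.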
- Overdraw:
- Rendering Pixels That Are Subsequently Covered: This occurs when the GPU renders pixels for objects that are then immediately hidden by other opaque objects drawn on top. Transparent objects are particularly prone to overdraw as they require drawing both the opaque and transparent layers.
- Example: Many semi-transparent particle effects overlapping in front of each other, or complex UI elements with alpha that frequently redraw.
- Tip: Use the Scene view's Overdraw draw mode (available in the Built-in Render Pipeline) to visualize overdraw, and the Frame Debugger to step through the order in which objects are drawn.
- High-Poly Models/Excessive Vertices:
- Models with Too Many Triangles: Using models with an unnecessarily high polygon count for their on-screen size can overwhelm the GPU's vertex processing capabilities, especially on lower-end devices.
- Example: A highly detailed 50,000-triangle character model used as a distant NPC that only occupies a few pixels on screen.
- Mobile Recommendation (Total Triangles per Frame): Aim for a total scene triangle count of under 100,000 - 500,000 triangles per frame for mid-range mobile devices, depending on shader complexity and fill rate. High-end devices might handle up to 1-2 million. Simpler scenes or older devices will require much less.
- Mobile Recommendation (Static Meshes): Optimize static environment meshes to be as low-poly as visually acceptable. Consider techniques like modularity and texture atlases to maximize batching.
- Unoptimized Shaders:
- Complex or Many Calculation-Heavy Shaders: Shaders that perform many mathematical operations, complex lighting calculations, or use high-precision floating-point numbers can be very expensive for the GPU to execute per pixel or vertex.
- Example: A custom shader with multiple texture lookups, complex lighting models (e.g., physically-based rendering on mobile), and expensive post-processing steps.
- Too Many Shader Variants: Each variant of a shader (e.g., different lighting models, shadow types) increases the build size and can lead to longer load times as the GPU needs to compile or load more shader permutations.
- Tip: Use Unity's Shader Stripping options and consider simpler shaders (e.g., Unlit, Mobile/Diffuse) when possible.
- Real-time Lighting and Shadows:
- Too Many Dynamic Light Sources: Each real-time light source adds significant rendering cost as it requires additional passes to calculate its effect on illuminated objects.
- High-Resolution or Many Dynamic Shadows: Shadows are one of the most computationally expensive rendering features. High-resolution shadow maps or many dynamic shadow casters can severely impact GPU performance.
- Tip: Prefer baked lighting (lightmaps) for static elements whenever possible. Limit the number of real-time lights and lower shadow quality/resolution on less powerful platforms.
- Post-Processing Effects:
- Heavy Use of Expensive Post-Processing Effects: Effects like bloom, ambient occlusion, depth of field, screen-space reflections, or global illumination all add significant overhead as they require rendering to off-screen buffers and performing complex calculations on the entire screen image.
- Tip: Use these effects sparingly, choose lower quality settings, or disable them on low-end hardware.
- Texture Overload:
- High-Resolution Textures: Using textures with resolutions (e.g., 4096x4096) much higher than necessary for how they appear on screen wastes GPU memory and bandwidth.
- Uncompressed Textures: Using uncompressed textures consumes significantly more GPU memory and bandwidth compared to compressed formats, leading to slower texture uploads and increased memory pressure.
- Tip: Always set appropriate Texture Import Settings (e.g., Compressed, Max Size, Mip Maps).
- Lack of Mipmaps: Textures without mipmaps (pre-calculated smaller versions) can cause inefficient texture sampling by the GPU, especially for objects far from the camera.
- Particle Systems:
- Large Numbers of Particles, Complex Particle Shaders, or High Overdraw: Particle systems can be GPU-intensive, particularly if they consist of many particles, use complex custom shaders, or cause significant overdraw (e.g., large, overlapping transparent particles).
- Tip: Optimize particle counts, use simpler shaders (e.g., Particles/Standard Unlit), and consider using GPU-instanced particles for very large numbers if applicable.
- Culling (Occlusion & Frustum):
- Frustum Culling (Automatic): Unity automatically prevents the GPU from drawing objects that are completely outside the camera's view frustum (the visible cone). This includes objects behind the camera or too far away based on the camera's clipping planes.
- Benefit: Reduces the number of objects the GPU attempts to draw, saving vertex processing and draw calls.
- Tip: Adjust your camera's Far Clipping Plane to exclude objects that are excessively far away and won't be seen by the player, further optimizing what's considered for rendering.
- Occlusion Culling (Manual Setup): This is an optimization you set up in Unity that prevents the GPU from drawing objects that are completely hidden by other opaque objects (e.g., a room behind a wall). It requires baking occlusion data for your scene.
- Benefit: Significantly reduces overdraw and draw calls in complex indoor or urban environments where many objects might be visually blocked. The GPU doesn't waste time drawing pixels that will be covered by closer geometry.
- Setup: Requires marking static geometry as "Occluder Static" and "Occludee Static" and baking an occlusion culling data set (Window > Rendering > Occlusion Culling). This creates data that the engine uses at runtime to determine what is visible.
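- Example (Sketch): The far clipping plane mentioned above can be set from script, and Camera.layerCullDistances allows a shorter cull distance for selected layers. The layer name "SmallProps", the distances, and the class name CullDistanceSetup are assumptions made for illustration.

```csharp
using UnityEngine;

// Hypothetical component: tightens culling distances on the attached camera.
[RequireComponent(typeof(Camera))]
public class CullDistanceSetup : MonoBehaviour
{
    [SerializeField] private float farClip = 300f;                  // Assumed far plane
    [SerializeField] private string smallPropsLayer = "SmallProps"; // Assumed layer name
    [SerializeField] private float smallPropsDistance = 60f;        // Assumed per-layer distance

    void Start()
    {
        Camera cam = GetComponent<Camera>();
        cam.farClipPlane = farClip;

        // NameToLayer returns -1 when the layer does not exist.
        int layer = LayerMask.NameToLayer(smallPropsLayer);
        if (layer < 0)
        {
            Debug.LogWarning("Layer not found: " + smallPropsLayer);
            return;
        }

        // The array has one entry per layer; 0 means "use the far clip plane".
        float[] distances = cam.layerCullDistances;
        distances[layer] = smallPropsDistance;
        cam.layerCullDistances = distances;
    }
}
```

Small decorative objects are usually the best candidates for a short per-layer distance, because they stop contributing visibly long before large geometry does.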
- Large Unoptimized Assets:
- High-resolution textures, uncompressed audio files, and excessively complex 3D models directly consume vast amounts of RAM.
- Memory Leaks:
- Occur when objects are no longer needed by the game but are still referenced by other parts of the code, preventing the garbage collector from freeing that memory. This leads to steadily increasing memory usage over time, and eventually to crashes or severe performance degradation.
- Example: Subscribing to an event but never unsubscribing when the object is destroyed, leaving a reference to the destroyed object.
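- Example (Sketch): A common pattern for avoiding this kind of leak is to pair every subscription with an unsubscription in the matching lifecycle callback. GameEvents, ScoreChanged, and ScoreListener are hypothetical names invented for this sketch.

```csharp
using System;
using UnityEngine;

// Hypothetical static event hub. Static events outlive scenes, so every
// subscriber they still reference is kept alive as well.
public static class GameEvents
{
    public static event Action<int> ScoreChanged;

    public static void RaiseScoreChanged(int newScore)
    {
        ScoreChanged?.Invoke(newScore);
    }
}

// Hypothetical listener that subscribes and unsubscribes symmetrically.
public class ScoreListener : MonoBehaviour
{
    void OnEnable()
    {
        GameEvents.ScoreChanged += HandleScoreChanged;
    }

    void OnDisable()
    {
        // Also runs when the object is destroyed, so the static event
        // no longer holds a reference to this component afterwards.
        GameEvents.ScoreChanged -= HandleScoreChanged;
    }

    private void HandleScoreChanged(int newScore)
    {
        Debug.Log("Score is now " + newScore.ToString());
    }
}
```

Subscribing with a named method rather than a lambda matters here: an anonymous lambda cannot be removed later unless the delegate instance was stored.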
- Frequent Instantiation/Destruction:
- As mentioned with GC, constantly creating and destroying objects not only triggers the garbage collector but can also lead to memory fragmentation, making it harder for the system to find large contiguous blocks of memory.
- Read/Write Enabled Textures/Meshes:
- If you enable the "Read/Write Enabled" option for textures or meshes in their import settings, Unity keeps a copy of the asset in both CPU (system) memory and GPU (video) memory. This effectively doubles their memory footprint, consuming unnecessary resources if you don't actually need to read or modify their data at runtime (e.g., procedurally modify mesh data, or call GetPixel on a texture).
- Tip: Disable "Read/Write Enabled" unless specifically required by your code.
- Synchronous Loading:
- Loading large assets or entire scenes all at once (synchronously) will block the main thread and freeze the game until the loading is complete, resulting in noticeable pauses or long loading screens.
- Example (Bad):
    // This will freeze your game until "HeavyScene" is loaded
    SceneManager.LoadScene("HeavyScene");
- Example (Good):
    // Requires: using System.Collections; using UnityEngine.SceneManagement;
    void Start()
    {
        // Loads asynchronously without blocking the main thread
        StartCoroutine(LoadYourAsyncScene("HeavyScene"));
    }

    IEnumerator LoadYourAsyncScene(string sceneName)
    {
        AsyncOperation asyncLoad = SceneManager.LoadSceneAsync(sceneName);
        // Wait until the asynchronous scene fully loads
        while (!asyncLoad.isDone)
        {
            yield return null;
        }
    }
- Unused Assets:
- Not properly unloading assets from memory when they are no longer needed (e.g., after leaving a level) can lead to unnecessary memory consumption, potentially causing memory issues later in the game session.
- Tip: Use Resources.UnloadUnusedAssets() strategically, often after loading a new scene, but be aware it can also cause a hitch.
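- Example (Sketch): One place to run the cleanup is right after unloading an additively loaded level, ideally behind a loading screen or fade where a hitch goes unnoticed. The class name LevelUnloader and its method names are assumptions; the sketch also assumes the level was loaded additively.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical helper: unloads an additive scene, then frees assets nothing references.
public class LevelUnloader : MonoBehaviour
{
    public void UnloadLevel(string sceneName)
    {
        StartCoroutine(UnloadAndCleanUp(sceneName));
    }

    private IEnumerator UnloadAndCleanUp(string sceneName)
    {
        // Assumes the scene was loaded additively; this object must live in
        // another scene, otherwise the coroutine stops with it.
        AsyncOperation unload = SceneManager.UnloadSceneAsync(sceneName);
        if (unload != null)
        {
            yield return unload;
        }

        // Returns an AsyncOperation, so it can be awaited like a scene load.
        yield return Resources.UnloadUnusedAssets();
    }
}
```

Assets that are still referenced from loaded scenes or static fields are not released by this call, so lingering references need to be cleared first.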
Here's a detailed checklist of strategies to minimize or eliminate GC spikes and ensure smooth gameplay:
- Pre-allocate and Reuse (Object Pooling):
- Strategy: Instead of using Instantiate() and Destroy() for frequently created objects like bullets, enemies, UI elements, visual effects, or even complex custom class instances, create a fixed number of them at startup or during a loading screen. Keep them in a "pool" (e.g., a List or Queue). When you need an object, retrieve an inactive one from the pool, activate it, and set its properties. When you're done with it, deactivate it and return it to the pool for later reuse.
- Impact: Drastically reduces new allocations and GC overhead for these objects.
- Example (Simple Bullet Pool):
    // Basic Bullet Pooling System (requires: using System.Collections.Generic;)
    public GameObject bulletPrefab;
    public int poolSize = 10;
    private Queue<GameObject> bulletPool = new Queue<GameObject>();

    void Awake()
    {
        for (int i = 0; i < poolSize; i++)
        {
            GameObject bullet = Instantiate(bulletPrefab);
            bullet.SetActive(false); // Start inactive
            bulletPool.Enqueue(bullet);
        }
    }

    public GameObject GetBullet()
    {
        if (bulletPool.Count > 0)
        {
            GameObject bullet = bulletPool.Dequeue();
            bullet.SetActive(true);
            return bullet;
        }
        // Optionally instantiate new if pool runs out, but this allocates
        Debug.LogWarning("Bullet pool exhausted! Instantiating new bullet.");
        return Instantiate(bulletPrefab);
    }

    public void ReturnBullet(GameObject bullet)
    {
        bullet.SetActive(false);
        bulletPool.Enqueue(bullet);
    }

    // Example Usage
    void Update()
    {
        if (Input.GetButtonDown("Fire1"))
        {
            GameObject newBullet = GetBullet();
            newBullet.transform.position = transform.position;
            newBullet.transform.rotation = transform.rotation;
            // Add component to handle bullet logic and call ReturnBullet when done
        }
    }
- Cache References:
- Strategy: Avoid calling GetComponent(), FindObjectOfType(), GameObject.Find(), or Camera.main within Update(), LateUpdate(), FixedUpdate(), or any method called repeatedly per frame. These operations are computationally expensive and can sometimes cause hidden allocations. Get the component or object reference once in Awake() or Start() and store it in a private field for quick access.
- Example (Caching Camera.main):
    private Camera _mainCamera;

    void Awake()
    {
        _mainCamera = Camera.main; // Cache once
    }

    void Update()
    {
        Vector3 mousePos = _mainCamera.ScreenToWorldPoint(Input.mousePosition);
        // ...
    }
- Minimize String Manipulations:
- Strategy: String concatenation (+), string.Format(), and ToString() methods all create new string objects on the heap. If you're frequently building UI text or logging messages, use a StringBuilder for efficient string construction, or only update text when the underlying value actually changes.
- Example (Using StringBuilder for UI):
    using UnityEngine.UI;
    using System.Text; // Required for StringBuilder

    public Text statusText;
    private StringBuilder _stringBuilder = new StringBuilder(100); // Pre-allocate capacity
    private int _playerHealth = 100;
    private int _playerScore = 0;
    private int _lastHealth = -1, _lastScore = -1; // Track last updated values

    void UpdateStatusUI()
    {
        _stringBuilder.Clear(); // Clear previous content
        _stringBuilder.Append("Health: ").Append(_playerHealth).Append("\n");
        _stringBuilder.Append("Score: ").Append(_playerScore);
        statusText.text = _stringBuilder.ToString(); // Only one ToString() call
    }

    void Update()
    {
        // Only update the UI if health or score changes, to avoid constant allocations
        if (_playerHealth != _lastHealth || _playerScore != _lastScore)
        {
            UpdateStatusUI();
            _lastHealth = _playerHealth;
            _lastScore = _playerScore;
        }
    }
- Avoid LINQ in Hot Paths:
- Strategy: While convenient, many LINQ (Language Integrated Query) operations (e.g., .Where(), .Select(), .ToList()) often create temporary enumerators and other objects, leading to allocations. In performance-critical sections of code, prefer traditional for or foreach loops.
- Example (Bad - LINQ in Update):
    public List<Enemy> allEnemies;

    void Update()
    {
        // This creates temporary objects for the IEnumerable and a new List every frame
        List<Enemy> activeEnemies = allEnemies.Where(e => e.IsActive).ToList();
        foreach (var enemy in activeEnemies)
        {
            enemy.Move();
        }
    }
- Example (Good - Traditional Loop):
    public List<Enemy> allEnemies;

    void Update()
    {
        // No new collections or enumerators allocated here
        for (int i = 0; i < allEnemies.Count; i++)
        {
            if (allEnemies[i].IsActive)
            {
                allEnemies[i].Move();
            }
        }
    }
- Don't Allocate Arrays/Lists Every Frame:
- Strategy: Avoid creating new arrays or lists inside methods called every frame. For physics queries, always use the NonAlloc versions (e.g., Physics.OverlapSphereNonAlloc, Physics.RaycastNonAlloc, Physics2D.OverlapAreaNonAlloc). These methods take a pre-allocated array as an argument and fill it, avoiding new allocations.
- Example (Physics.OverlapSphereNonAlloc):
    private Collider[] _hitBuffer = new Collider[20]; // Pre-allocate once

    void Update()
    {
        // Returns how many entries of the buffer were filled (at most its length)
        int numHits = Physics.OverlapSphereNonAlloc(transform.position, 5f, _hitBuffer);
        for (int i = 0; i < numHits; i++)
        {
            // Process _hitBuffer[i]
        }
    }
- foreach Loop Allocations (Contextual):
- Strategy: In old Unity versions (before the C# compiler upgrade in Unity 5.5), foreach loops over collections such as List<T> could generate garbage because the struct enumerator was boxed. Modern Unity has fixed this, so foreach over a List<T> or an array is generally safe now. Iterating through an interface type (e.g., IEnumerable<T>) can still allocate an enumerator, so if you're on a very old project, or profiling reveals this, consider a for loop.
- Cache WaitForSeconds, WaitForEndOfFrame, etc.:
- Strategy: yield return new WaitForSeconds(X); or yield return new WaitForEndOfFrame(); inside a loop or a frequently running coroutine creates a new object every time. Instead, declare these yield instruction objects as static readonly fields once and reuse them.
- Example (Caching WaitForSeconds):
    // Cache these objects once globally for reuse across coroutines
    private static readonly WaitForSeconds _waitOneSecond = new WaitForSeconds(1.0f);
    private static readonly WaitForEndOfFrame _endOfFrame = new WaitForEndOfFrame();
    private static readonly WaitForFixedUpdate _fixedUpdate = new WaitForFixedUpdate();

    IEnumerator MySmoothCoroutine()
    {
        Debug.Log("Starting smooth coroutine...");
        while (true)
        {
            // No new allocations for these WaitFor objects each time they are yielded
            yield return _waitOneSecond; // Reuses the cached WaitForSeconds object
            Debug.Log("One second passed.");
            yield return _endOfFrame; // Reuses the cached WaitForEndOfFrame object
            Debug.Log("End of frame.");
            yield return _fixedUpdate; // Reuses the cached WaitForFixedUpdate object
            Debug.Log("Fixed update.");
        }
    }

    // Example usage:
    void Start()
    {
        StartCoroutine(MySmoothCoroutine());
    }
- Use Generic Collections:
- Strategy: Always prefer generic collections (List<T>, Dictionary<TKey, TValue>, HashSet<T>) over their non-generic counterparts (ArrayList, Hashtable). Non-generic collections store elements as object, which forces value types (like int, float, structs) to be "boxed" (wrapped in a new heap object) when added and "unboxed" (extracted) when retrieved, creating garbage.
- Example (Bad - Boxing with ArrayList):
    ArrayList myList = new ArrayList();
    myList.Add(10); // int (value type) is boxed into an object
    myList.Add(20);
    int sum = 0;
    foreach (int i in myList) // int is unboxed here
    {
        sum += i;
    }
- Example (Good - No Boxing with List):
    List<int> myList = new List<int>();
    myList.Add(10); // No boxing
    myList.Add(20);
    int sum = 0;
    foreach (int i in myList) // No unboxing
    {
        sum += i;
    }
- Avoid Unnecessary Boxing with Value Types:
- Strategy: Be cautious when passing value types to methods that expect object or when performing operations that implicitly box.
- Example (Implicit boxing with string.Format):
    int health = 50;
    // string.Format takes object arguments, so 'health' is boxed on every call.
    // String interpolation with a value type generally compiles to the same call
    // in C# versions before 10, so it boxes as well.
    string message = string.Format("Player health: {0}", health);
- Example (Avoiding the box):
    int health = 50;
    // Calling ToString() on the int avoids boxing; the strings themselves are still allocated
    string message = "Player health: " + health.ToString();
- Update Text Only When Necessary:
- Strategy: Don't update myText.text every frame if the value displayed hasn't changed. Implement logic to check if the new value is different from the old one before assigning the text. (See the StringBuilder example under "Minimize String Manipulations" above for a combined approach.)
- Disable Layout Rebuilders:
- Strategy: Unity UI's layout system can be CPU-intensive. If your UI elements' sizes or positions are static, you can disable components like ContentSizeFitter, HorizontalLayoutGroup, and VerticalLayoutGroup once the layout has been built, to prevent unnecessary recalculations. You can also disable a Canvas's CanvasScaler or GraphicRaycaster if they are not needed.
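- Example (Sketch): One way to apply this is to let the layout run once and then switch the layout components off. This is a sketch under the assumption that the panel's children do not change afterwards; the class name FreezeLayoutAfterBuild is hypothetical, and the behavior is worth verifying in your own UI before relying on it.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.UI;

// Hypothetical component: builds the layout once, then disables the layout drivers.
public class FreezeLayoutAfterBuild : MonoBehaviour
{
    IEnumerator Start()
    {
        // Wait one frame so all children have been created and enabled.
        yield return null;

        RectTransform root = transform as RectTransform;
        if (root == null)
        {
            yield break; // Not a UI object, nothing to freeze
        }

        // Make sure sizes and positions are final before freezing them.
        LayoutRebuilder.ForceRebuildLayoutImmediate(root);

        // Covers HorizontalLayoutGroup, VerticalLayoutGroup and GridLayoutGroup.
        foreach (LayoutGroup group in GetComponentsInChildren<LayoutGroup>())
        {
            group.enabled = false;
        }
        foreach (ContentSizeFitter fitter in GetComponentsInChildren<ContentSizeFitter>())
        {
            fitter.enabled = false;
        }
    }
}
```

If the panel's content changes later, re-enable the components, rebuild, and disable them again; otherwise new children will not be positioned.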
- Ensure UI Elements Batch Effectively:
- Strategy: UI elements on the same canvas using the same material (e.g., same atlas texture for images, or same font for text) can be batched together by Unity's UI system, reducing draw calls. Breaking batching (e.g., by inserting elements with different materials or a Canvas within another Canvas) will increase draw calls and potentially CPU overhead.
- Use the Unity Profiler:
- Location: Access it via Window > Analysis > Profiler.
- Key Feature: Enable the "Record Allocations" setting (the little "GC" icon at the top of the CPU Usage module).
- Identification: Look for noticeable spikes in the "GC Alloc" graph.
- Diagnosis: When a spike occurs, click on the corresponding frame. Then, in the lower panel of the CPU Usage module, expand the "GC Alloc" section. This will show you a detailed call stack, pinpointing exactly which methods and lines of code are generating the most garbage. This is crucial for knowing where to focus your optimization efforts.
- Other Useful Modules:
- CPU Usage: Identify expensive methods, physics, animation, and rendering overhead.
- GPU Usage: Find bottlenecks related to shaders, overdraw, and draw calls.
- Memory: Track total memory usage, find asset memory hogs, and detect leaks.
- Rendering: Examine draw calls, batches, and triangle/vertex counts.
- Physics: Analyze physics step times and collider costs.
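- Example (Sketch): To make your own code easier to find in the CPU Usage module, you can wrap suspect sections in named profiler samples. EnemySpawner, SpawnWave, and the sample label are hypothetical names chosen for this sketch.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical component: labels a block of work so it appears by name in the Profiler.
public class EnemySpawner : MonoBehaviour
{
    void Update()
    {
        // Everything between BeginSample and EndSample is grouped under this label,
        // including its time and GC allocations.
        Profiler.BeginSample("EnemySpawner.SpawnWave");
        SpawnWave();
        Profiler.EndSample();
    }

    private void SpawnWave()
    {
        // Placeholder for the work being measured.
    }
}
```

Every BeginSample needs a matching EndSample on all code paths, so keep early returns outside the sampled region.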
- Enable Incremental Garbage Collection:
- Location: Go to Project Settings > Player > Other Settings > Configuration > Use incremental GC.
- Strategy: When enabled, Unity splits the work of a garbage collection into smaller slices spread over multiple frames instead of performing it in one long pause. This reduces the duration of any single GC pause, making "hiccups" less noticeable, even if the total time spent on GC might slightly increase. This is generally recommended for smoother gameplay and is often enabled by default in modern Unity versions. Only disable it if specific profiling reveals it's causing new issues in a very particular scenario.
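- Example (Sketch): The setting can be checked at runtime through the UnityEngine.Scripting.GarbageCollector class, which also exposes the time slice used per frame. The class name GcSettingsLogger and the 2 ms budget are assumptions for illustration, not tuned values.

```csharp
using UnityEngine;
using UnityEngine.Scripting;

// Hypothetical component: reports the GC configuration and sets a per-frame budget.
public class GcSettingsLogger : MonoBehaviour
{
    // Assumed budget of 2 ms per frame, expressed in nanoseconds.
    private const ulong TimeSliceNanoseconds = 2000000UL;

    void Start()
    {
        // True when the player was built with "Use incremental GC" enabled.
        Debug.Log("Incremental GC enabled: " + GarbageCollector.isIncremental.ToString());

        if (GarbageCollector.isIncremental)
        {
            // Upper bound on the time the collector may use in a single slice.
            GarbageCollector.incrementalTimeSliceNanoseconds = TimeSliceNanoseconds;
        }
    }
}
```

A smaller slice shortens each pause but stretches a collection over more frames, so the right value depends on your frame budget and allocation rate.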
- General Performance and Optimization:
- Modeling & Asset Optimization:
- UI Performance:
- Rendering & Graphics Performance:
- Performance impact of shadow cascades
- Performance tips for trees
- Graphics performance and profiling
- Graphics performance and profiling in URP
- Configure for better performance in URP
- Adjust settings to improve performance in URP
- Graphics performance and profiling in the Built-In Render Pipeline
- Graphics performance and profiling reference
- Usage and Performance of Built-in Shaders
- Profiling & Data Collection:
- Collect performance data
- Collect performance data introduction
- Collect performance data in Play mode
- Collect performance data on a target platform
- Collecting performance data on an Android device
- Collecting performance data on an iOS device (Note: While titled "Device Simulator for iOS", it contains a section on collecting performance data on iOS devices)
- CPU performance data
- Memory performance data
- Measure performance with the built-in profiler (Note: This refers to an older, deprecated internal profiler on iOS, but the page still contains general profiling insights)
- Platform-Specific Optimizations: