Chapter 1
1.2. Embedded Systems in Society and Industry
1.3. Embedded Computing Trends
1.4. Embedded Systems: Prototyping and Production
1.5. About LARA: An Aspect-Oriented Approach
1.6. Objectives and Target Audience
1.7. Complementary Bibliography
1.8. Dependences in Terms of Knowledge
1.9. Examples and Benchmarks
Chapter 2: High-performance embedded computing
2.2. Target Architectures
2.2.1. Hardware Accelerators as Coprocessors
2.2.2. Multiprocessor and Multicore Architectures
2.2.3. Heterogeneous Multiprocessor/Multicore Architectures
2.2.4. OpenCL Platform Model
2.3. Core-Based Architectural Enhancements
2.3.1. Single Instruction, Multiple Data Units
2.3.2. Fused Multiply-Add Units
2.3.3. Multithreading Support
2.4. Common Hardware Accelerators
2.4.2. Reconfigurable Hardware Accelerators
2.4.3. SoCs With Reconfigurable Hardware
2.5.2. The Roofline Model
2.5.3. Worst-Case Execution Time Analysis
2.6. Power and Energy Consumption
2.6.1. Dynamic Power Management
2.6.2. Dynamic Voltage and Frequency Scaling
Chapter 3: Controlling the design and development cycle
3.2. Specifications in MATLAB and C: Prototyping and Development
3.2.1. Abstraction Levels
3.2.2. Dealing With Different Concerns
3.2.3. Dealing With Generic Code
3.2.4. Dealing With Multiple Targets
3.3. Translation, Compilation, and Synthesis Design Flows
3.4. Hardware/Software Partitioning
3.4.1. Static Partitioning
3.4.2. Dynamic Partitioning
3.5. LARA: A Language for Specifying Strategies
3.5.3. Exec and Def Actions
3.5.5. Executing External Tools
3.5.6. Compilation and Synthesis Strategies in LARA
Chapter 4: Source code analysis and instrumentation
4.2. Analysis and Metrics
4.3. Static Source Code Analysis
4.4. Dynamic Analysis: The Need for Instrumentation
4.4.1. Information From Profiling
4.5. Custom Profiling Examples
4.5.3. Dynamic Call Graphs
4.5.4. Branch Frequencies
Chapter 5: Source code transformations and optimizations
5.2. Basic Transformations
5.3. Data Type Conversions
5.6. Loop-Based Transformations
5.6.4. Loop Fusion and Loop Fission
5.6.5. Loop Interchange and Loop Permutation (Loop Reordering)
5.6.11. Loop Tiling (Loop Blocking)
5.6.16. Software Pipelining
5.6.17. Evaluator-Executor Transformation
5.6.19. Other Loop Transformations
5.7. Function-Based Transformations
5.7.1. Function Inlining/Outlining
5.7.2. Partial Evaluation and Code Specialization
5.7.3. Function Approximation
5.8. Data Structure-Based Transformations
5.8.1. Scalar Expansion, Array Contraction, and Array Scalarization
5.8.2. Scalar and Array Renaming
5.8.3. Arrays and Records
5.8.4. Reducing the Number of Dimensions of Arrays
5.8.5. From Arrays to Pointers and Array Recovery
5.8.7. Representation of Matrices and Graphs
5.8.9. Data Layout Transformations
5.8.10. Data Replication and Data Distribution
5.9. From Recursion to Iterations
5.10. From Nonstreaming to Streaming
5.11. Data and Computation Partitioning
5.11.1. Data Partitioning
5.11.2. Partitioning Computations
5.11.3. Computation Offloading
Chapter 6: Code retargeting for CPU-based platforms
6.2. Retargeting Mechanisms
6.3. Parallelism and Compiler Options
6.3.1. Parallel Execution Opportunities
6.3.3. Compiler Phase Selection and Ordering
6.5. Shared Memory (Multicore)
6.6. Distributed Memory (Multiprocessor)
6.7. Cache-Based Program Optimizations
6.8.1. Capturing Heuristics to Control Code Transformations
6.8.2. Parallelizing Code With OpenMP
6.8.3. Monitoring an MPI Application
Chapter 7: Targeting heterogeneous computing platforms
7.2. Roofline Model Revisited
7.3. Workload Distribution
7.4. Graphics Processing Units
7.5. High-Level Synthesis
Chapter 8: Additional topics
8.2. Design Space Exploration
8.2.1. Single-Objective Optimization and Single/Multiple Criteria
8.2.2. Multiobjective Optimization, Pareto Optimal Solutions
8.3. Hardware/Software Codesign
8.4. Runtime Adaptability
8.4.1. Tuning Application Parameters
8.4.2. Adaptive Algorithms
8.4.3. Resource Adaptivity
8.5. Automatic Tuning (Autotuning)
8.5.2. Static and Dynamic Autotuning
8.5.3. Models for Autotuning
8.5.4. Autotuning Without Dynamic Compilation
8.5.5. Autotuning With Dynamic Compilation
8.6. Using LARA for Exploration of Code Transformation Strategies