Basic use of Intel Software Development Emulator (SDE)


Reading time: 20 minutes | Coding time: 5 minutes

Intel Software Development Emulator (SDE) runs a given program as if it were executing on a specific Intel CPU's instruction set, and it can capture details of the execution such as the mix of executed instructions (using the MIX tool).

In this article, we will learn to use Intel SDE with a small C++ program and capture the executed instructions using the MIX tool, which is built into Intel SDE.

The first step is to install Intel SDE on your platform. To do this on Linux, follow these steps:

The download is protected, so you have to accept the license and download the archive manually. This step cannot be automated.

You will have the file as:

sde-external-8.35.0-2019-03-11-lin.tar.bz2

Extract the binary as:

tar -xjf sde-external-8.35.0-2019-03-11-lin.tar.bz2

A new folder named sde-external-8.35.0-2019-03-11-lin will be created. The exact name may differ depending on your version.

Next, go inside the folder and explore its contents:

cd sde-external-8.35.0-2019-03-11-lin
ls -a

Output:

ia32     Licenses     misc        sde    src  xed64
intel64  LICENSE.txt  README.txt  sde64  xed

It contains internal libraries for emulating ia32 and intel64. Add the SDE folder to your PATH environment variable (replace /path with the location where you extracted the archive):

export PATH=/path/sde-external-8.35.0-2019-03-11-lin/:$PATH
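A mistyped folder name in the export goes unnoticed until the sde command fails. As a minimal sketch, the helper below only prepends a directory to PATH if it really contains an executable named sde. The function name add_sde_to_path is hypothetical and not part of SDE.

```shell
# Prepend an SDE install directory to PATH, but only if it contains an
# executable named "sde". add_sde_to_path is an illustrative helper name.
add_sde_to_path() {
    sde_dir=$1
    if [ -x "$sde_dir/sde" ]; then
        PATH="$sde_dir:$PATH"
        export PATH
        return 0
    fi
    echo "no executable 'sde' found in $sde_dir" >&2
    return 1
}

# Example call; adjust the path to wherever you extracted the archive.
# add_sde_to_path "$HOME/sde-external-8.35.0-2019-03-11-lin"
```

Put the function in your shell startup file if you want the check to run in every new session.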

Check help for SDE as follows:

sde -help

Part of the output looks as follows:

Intel(R) Software Development Emulator.  Version:  8.35.0 external
Copyright (C) 2008-2016, Intel Corporation. All rights reserved.

 Usage: sde [args] -- application [application-args]

 For the longer tool help, use "-help-long".

 If one of "-mix", "-debugtrace", or "-t toolname" are not,
 supplied, then just the underlying emulator will run.

     -mix                Run mix histogram tool
     -omix               Set the output file name for mix, Implies -mix
                         Default is "mix.out"

     -footprint          Run footprint tool
     -ofootprint         Set the output file name for footprint,
                         Implies -footprint. Default is "footprint.out"

     -debugtrace         Run mix debugtrace tool
     -odebugtrace        Set the output file name for debugtrace,
                         Implies -debugtrace
                         Default is "debugtrace.out"

     -ast                Run the Intel(R) AVX/SSE transition checker
     -oast               Set the output file name for the Intel AVX/SSE
                         transition checker. Implies -ast
                         Default is "avx-sse-transition.out"

     -quark              Set chip-check and CPUID for Intel(R) Quark CPU
     -p4                 Set chip-check and CPUID for Intel(R) Pentium4 CPU

In this guide, we will use the mix tool.

We need a program to run and analyze. For this, we will use a simple C++ program with a for loop of 5 iterations. The code (in a file named "code.cpp") is as follows:

#include <iostream>
int main() 
{
	int i = 0;
	for(i=0; i<5; i++)
	    std::cout << i << " ";
}

To compile it and run it natively, use the following commands:

g++ -std=c++11 code.cpp
./a.out

Output:

0 1 2 3 4

Run on a different architecture

Now, suppose you have an old computer with an Intel Broadwell CPU but you want to run your program as if it were on an Intel Skylake CPU. We can do this using SDE. The syntax is:

sde -<platform_code> -- [command]

You will find the platform codes in the help output. For our case, we want to run it on Skylake server (code: skx). The command is:

sde -skx -- ./a.out

The output is the same:

0 1 2 3 4

If we want to run it on Cascade Lake, use the following command (the code is clx; check sde -help if your version lists it differently):

sde -clx -- ./a.out

The output is the same:

0 1 2 3 4

SDE supports many platforms, some of which are:

  • Skylake server: skx
  • Cascade Lake: clx
  • Quark CPU: quark
  • Future Intel chip: future
  • Knights Mill CPU: knm
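Checking several platforms by hand gets repetitive, so here is a minimal sketch that runs the same program under a list of platform codes and reports whether each emulated run prints the same text as the native run. The function name compare_platforms and the SDE_BIN override variable are hypothetical; only the sde -<platform_code> -- [command] syntax comes from SDE itself.

```shell
# Run one program under several SDE platform codes and compare each
# emulated run's output with the native run's output.
# SDE_BIN is an illustrative override; by default the "sde" on PATH is used.
compare_platforms() {
    prog=$1
    shift
    expected=$("$prog")
    for code in "$@"; do
        actual=$("${SDE_BIN:-sde}" "-$code" -- "$prog")
        if [ "$actual" = "$expected" ]; then
            echo "$code: same output"
        else
            echo "$code: output differs"
        fi
    done
}

# Example call with platform codes from the list above:
# compare_platforms ./a.out skx knm
```

A platform that fails to run the program shows up as "output differs", because its captured output will not match the native run.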

Run using a tool

While running a program on a platform, we can enable a tool as well. One such tool is the MIX tool, which records the instructions that were executed. This is useful for studying how execution differs across platforms.

The syntax for adding a tool is as follows:

sde -<platform_code> -<tool> -- [command]

The available tools include:

  • MIX tool (-mix)
  • Footprint tool (-footprint)
  • debugtrace tool (-debugtrace)
  • AVX/SSE transition checker (-ast)

In our case, we want to run the program on Skylake server (skx) with the MIX tool (-mix). The command is as follows:

sde -skx -mix -- ./a.out

This will generate a new file named sde-mix-out.txt in the current directory, which holds the output of the MIX tool. Use -omix to choose a different file name.
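When you compare platforms, each run would otherwise write to the same file name. As a minimal sketch, the helper below uses the -omix option from the help output to write one report per platform code. The function name collect_mix, the mix-<code>.txt naming scheme and the SDE_BIN override are hypothetical choices.

```shell
# Produce one MIX report per platform code, naming each file explicitly
# with -omix so that later runs do not overwrite earlier ones.
# SDE_BIN is an illustrative override; by default the "sde" on PATH is used.
collect_mix() {
    prog=$1
    shift
    for code in "$@"; do
        # The program's own output is discarded; only the report matters here.
        "${SDE_BIN:-sde}" "-$code" -omix "mix-$code.txt" -- "$prog" >/dev/null || return 1
        echo "mix-$code.txt"
    done
}

# Example call:
# collect_mix ./a.out skx knm
```

The function prints the name of each report it wrote, so its output can be passed on to other commands.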

The MIX output file is large; its first few lines are as follows:

# Mix output version 10
# Intel(R) SDE version: 8.35.0 external
# Starting tid 0,  OS-TID 40701
# FINI: end of program

# EMIT_IMAGE_ADDRESSES
#
#    IMAGE NAME                                                                              LOW ADDRESS   HIGH ADDRESS
#
/home/amd/Desktop/aditya/sde/sde-external-8.35.0-2019-03-11-lin/a.out                        000000400000  0000004009db
/lib64/ld-linux-x86-64.so.2                                                                  7f06c2f38000  7f06c2f5993f
[vdso]                                                                                       7fffa195d000  7fffa195de6c
/lib64/libstdc++.so.6                                                                        7f06adbc1000  7f06adec735f
/lib64/libm.so.6                                                                             7f06ad845000  7f06adb46137
/lib64/libgcc_s.so.1                                                                         7f06ad625000  7f06ad83a3ff
/lib64/libc.so.6                                                                             7f06ad257000  7f06ad6241df
# END_IMAGE_ADDRESSES

# ==============================================
# STATS FOR TID 0  OS-TID  40701 EMIT# 1
# ==============================================
# EMIT_TOP_BLOCK_STATS FOR TID 0  OS-TID 40701 EMIT # 1 EVENT=ICOUNT
BLOCK:     0   PC: 00007f06c2f41f98   ICOUNT:    401832   EXECUTIONS:     50229   #BYTES: 24   %:  31.1   cumltv%:  31.1  FN: _dl_lookup_symbol_x  IMG: /lib64/ld-linux-x86-64.so.2  OFFSET: 9f98
XDIS 00007f06c2f41f98: BASE 4C89F1                   mov rcx, r14
XDIS 00007f06c2f41f9b: BASE 4883C201                 add rdx, 0x1
XDIS 00007f06c2f41f9f: BASE 48C1E105                 shl rcx, 0x5
XDIS 00007f06c2f41fa3: BASE 4901CE                   add r14, rcx
XDIS 00007f06c2f41fa6: BASE 4901C6                   add r14, rax
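Each BLOCK: line in the top-block section carries the instruction count, the share of all executed instructions, and the enclosing function and image. A small sketch for pulling those fields out of a MIX report follows. The function name hot_blocks is hypothetical, and the field labels are taken from the sample output above, so other SDE versions may format the line differently.

```shell
# Print "<percent>% <icount> <function> <image>" for every BLOCK: line
# in a MIX report. Fields are located by their labels, not by position.
hot_blocks() {
    awk '$1 == "BLOCK:" {
        icount = pct = fn = img = "?"
        for (i = 1; i < NF; i++) {
            if ($i == "ICOUNT:") icount = $(i + 1)
            else if ($i == "%:") pct = $(i + 1)
            else if ($i == "FN:") fn = $(i + 1)
            else if ($i == "IMG:") img = $(i + 1)
        }
        print pct "% " icount " " fn " " img
    }' "$1"
}

# Example call:
# hot_blocks sde-mix-out.txt | head
```

For the block shown above, this reports that 31.1% of the executed instructions fall in _dl_lookup_symbol_x inside the dynamic loader, not in our own loop.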

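The summary is at the end of the file. It lists the dynamic count of each executed instruction, preceded by aggregate counters (the lines starting with *). Note that the counts cover the whole process, including the dynamic loader and the C++ runtime (the hottest block above is in _dl_lookup_symbol_x), not just our five-iteration loop. It is as follows: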

# END_STATIC_STATS
# EMIT_GLOBAL_DYNAMIC_STATS   EMIT# 1
#
# $global-dynamic-counts
#
#       opcode                 count
#
*stack-read                                                        88206
*stack-write                                                       94189
*iprel-read                                                        10362
*iprel-write                                                        3192
*mem-read-1                                                       108318
*mem-read-2                                                         5363
*mem-read-4                                                        40478
*mem-read-8                                                       160505
*mem-read-16                                                         135
*mem-write-1                                                        6006
*mem-write-2                                                          92
*mem-write-4                                                       17210
*mem-write-8                                                       96837
*mem-write-16                                                         17
*mem-write-56                                                         77
*mem-read                                                         314876
*mem-write                                                        120239
*mem                                                              431262
*isa-ext-BASE                                                    1289406
*isa-ext-LONGMODE                                                   2921
*isa-ext-SSE                                                          17
*isa-ext-SSE2                                                        386
*isa-ext-SSE3                                                          8
*isa-ext-SSE4                                                         16
*isa-ext-SSE4A                                                        77
*isa-ext-SSSE3                                                         8
*isa-ext-XSAVE                                                        78
*isa-ext-XSAVEC                                                       77
*isa-set-CMOV                                                       2163
*isa-set-FAT_NOP                                                    6615
*isa-set-I186                                                      59846
*isa-set-I386                                                      74493
*isa-set-I486REAL                                                     53
*isa-set-I86                                                     1146227
*isa-set-LONGMODE                                                   2921
*isa-set-PENTIUMREAL                                                   9
*isa-set-SSE                                                          17
*isa-set-SSE2                                                        386
*isa-set-SSE3                                                          8
*isa-set-SSE4                                                         16
*isa-set-SSSE3                                                         8
*isa-set-XSAVE                                                        78
*isa-set-XSAVEC                                                       77
*category-BINARY                                                  308809
*category-BITBYTE                                                   1404
*category-CALL                                                      8211
*category-CMOV                                                      2163
*category-COND_BR                                                 199661
*category-CONVERT                                                     35
*category-DATAXFER                                                421085
*category-LOGICAL                                                 162419
*category-MISC                                                     31371
*category-NOP                                                         27
*category-POP                                                      25409
*category-PUSH                                                     25655
*category-RET                                                       8207
*category-ROTATE                                                      51
*category-SEMAPHORE                                                   25
*category-SETCC                                                     5918
*category-SHIFT                                                    72515
*category-SSE                                                        199
*category-STRINGOP                                                  5610
*category-SYSCALL                                                     80
*category-SYSTEM                                                       9
*category-UNCOND_BR                                                 7284
*category-WIDENOP                                                   6615
*category-XSAVE                                                      155
*ilen-1                                                            28831
*ilen-2                                                           321305
*ilen-3                                                           419435
*ilen-4                                                           243809
*ilen-5                                                            78386
*ilen-6                                                            89976
*ilen-7                                                            79250
*ilen-8                                                            21368
*ilen-9                                                             6027
*ilen-10                                                             288
*ilen-11                                                            4230
*ilen-12                                                              12
*nop-ilen-1                                                           18
*nop-ilen-2                                                            9
*nop-ilen-3                                                          208
*nop-ilen-4                                                           62
*nop-ilen-5                                                           30
*nop-ilen-6                                                         1353
*nop-ilen-7                                                         1337
*nop-ilen-8                                                         3550
*nop-ilen-9                                                            8
*nop-ilen-10                                                          67
*scalar-simd                                                           2
*sse-scalar                                                            2
*sse-packed                                                          433
*gpr64                                                            735707
*legacy-prefixes-0                                                549469
*legacy-prefixes-1                                                742674
*legacy-prefixes-2                                                   774
*legacy-prefixes-and-escapes-0                                    408895
*legacy-prefixes-and-escapes-1                                    871048
*legacy-prefixes-and-escapes-2                                     12942
*legacy-prefixes-and-escapes-3                                        30
*legacy-prefixes-and-escapes-4                                         2
*segment_prefix                                                      505
*rep_prefix                                                         5610
*66_prefix                                                          3387
*rex_prefix                                                       734720
*rexw_prefix                                                      660561
*rexr_prefix                                                      142599
*rexx_prefix                                                        2796
*rexb_prefix                                                      222778
*loadop                                                           146486
*one-memops                                                       437559
*two-memops                                                          318
*cond-branch-forward                                              108842
*cond-branch-backward                                              90819
*disp_only                                                           363
*index_disp                                                           20
*base_only                                                        193858
*base_disp                                                        220932
*base_index                                                        16954
*base_index_disp                                                    5750
*scale_1                                                          438994
*scale_2                                                           10184
*scale_4                                                            5397
*scale_8                                                           14640
*memdisp8                                                         135427
*memdisp32                                                        108032
*elements_i1_128                                                     125
*elements_i8_16                                                       86
*elements_i32_1                                                       78
*elements_i32_4                                                       19
*elements_i64_2                                                        8
*elements_i128_1                                                       8
*convert_i32_1                                                        34
*convert_i64_1                                                         1
*dataxfer_i32_1                                                        1
*dataxfer_i32_4                                                       91
*dataxfer_fp_single_4                                                 17
*dataxfer_fp_double_1                                                  2
*PL-prefixed-UNCOND-BR                                                77
ADD                                                               187340
AND                                                                19786
BSF                                                                   73
BSR                                                                    2
BT                                                                  1329
CALL_NEAR                                                           8211
CDQE                                                                  34
CMOVB                                                                625
CMOVBE                                                                21
CMOVNBE                                                               15
CMOVNS                                                               128
CMOVNZ                                                                45
CMOVZ                                                               1329
CMP                                                                76504
CMPXCHG                                                               25
CPUID                                                                 28
CQO                                                                    1
DEC                                                                  173
DIV                                                                 1660
IDIV                                                                   1
IMUL                                                                 112
INC                                                                34837
JB                                                                   191
JBE                                                                 4114
JLE                                                                  108
JMP                                                                 7284
JNB                                                                 1807
JNBE                                                                9377
JNL                                                                   41
JNLE                                                                1331
JNS                                                                   94
JNZ                                                               115680
JS                                                                    72
JZ                                                                 66846
LDDQU                                                                  8
LEA                                                                31338
LEAVE                                                                  5
MOV                                                               351381
MOVAPS                                                                17
MOVD                                                                   1
MOVDQA                                                                18
MOVDQU                                                                73
MOVSD_XMM                                                              2
MOVSX                                                                547
MOVSXD                                                              2408
MOVZX                                                              66604
MUL                                                                   12
NEG                                                                  181
NOP                                                                 6642
NOT                                                                   20
OR                                                                  2599
PALIGNR                                                                8
PCMPEQB                                                               86
PMOVMSKB                                                              78
POP                                                                25409
PSHUFD                                                                 1
PSLLDQ                                                                 8
PSUBB                                                                  8
PTEST                                                                 16
PUNPCKLBW                                                              2
PUSH                                                               25655
PXOR                                                                 109
RDTSC                                                                  9
REP_STOSB                                                           5192
REP_STOSD                                                             20
REP_STOSQ                                                            398
RET_NEAR                                                            8207
ROL                                                                   46
ROR                                                                    5
SAR                                                                 1381
SBB                                                                   61
SETNZ                                                                 96
SETZ                                                                5822
SHL                                                                51132
SHR                                                                20002
SUB                                                                 7928
SYSCALL                                                               80
TEST                                                              122320
XCHG                                                                  34
XGETBV                                                                 1
XOR                                                                17569
XRSTOR                                                                77
XSAVEC                                                                77
*total                                                           1292917

# END_GLOBAL_DYNAMIC_STATS
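The summary is long, so it helps to rank the opcodes. The sketch below reads the global dynamic stats section of a MIX report, skips the aggregate lines that start with *, and prints the most frequent opcodes with their share of the *total line. The function name top_opcodes is hypothetical, and the section markers are assumed to match the sample output above.

```shell
# Print the N most frequently executed opcodes (default 5) from the
# global dynamic stats section of a MIX report, as "<count> <opcode> <share>%".
top_opcodes() {
    awk '
        /^# EMIT_GLOBAL_DYNAMIC_STATS/ { in_stats = 1; next }
        /^# END_GLOBAL_DYNAMIC_STATS/  { in_stats = 0 }
        !in_stats || /^#/ || NF != 2   { next }
        $1 == "*total"                 { total = $2; next }
        $1 !~ /^\*/                    { count[$1] = $2 }
        END {
            for (op in count) {
                share = (total > 0) ? 100 * count[op] / total : 0
                printf "%s %s %.1f%%\n", count[op], op, share
            }
        }
    ' "$1" | sort -k1,1nr -k2,2 | head -n "${2:-5}"
}

# Example call:
# top_opcodes sde-mix-out.txt 10
```

Running the same function on reports from two platforms gives a quick way to see whether the instruction mix changed.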

This gives you a basic idea of how to use Intel SDE. We suggest you try the other platform options and tools and explore their output.

Happy learning.