Programming in R
COURSE NOTES 2 Hoganson
Language Translation
Dr. Ken Hoganson, © August 2014
Language Translation
• A computer works entirely in 0s and 1s, which is hard for
humans.
• So we have created languages that are easier
for us (humans) to work with.
• Human-friendly languages require computer
time to translate into computer-executable
programs.
• Ongoing trend since the computer was created:
make better human interfaces to the machine,
using the ever increasing power of the computer
to do the translation work “behind the scenes”.
• Think GUI interfaces and virtual-reality interfaces.
Machine Code: Just a Taste
Machine code – the bottom line in
programming.
Machine code instructions are divided
into fields, and the instruction has a
specified format.
Simple example:
Machine Code
• This instruction has four fields:
– Instruction type (2 bits)
– Operation code (6 bits)
– Register operand 1 (4 bits)
– Register operand 2 (4 bits)
• 16-bit (two-byte) instruction
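The four fields can be packed into one 16-bit word with bit shifts. This is a minimal sketch in Python, assuming the fields are laid out from the highest bit down in the order listed. The function name `pack_instruction` and the sample field values are made up for illustration.

```python
def pack_instruction(instr_type, opcode, reg1, reg2):
    """Pack four fields into one 16-bit instruction word.

    Assumed layout, high bits to low bits:
    type (2 bits) | op code (6 bits) | register 1 (4 bits) | register 2 (4 bits)
    """
    # Reject values that do not fit in their field.
    for value, width in [(instr_type, 2), (opcode, 6), (reg1, 4), (reg2, 4)]:
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{value} does not fit in {width} bits")
    # Shift each field to its position and combine them.
    return (instr_type << 14) | (opcode << 8) | (reg1 << 4) | reg2


# Sample values chosen only for illustration.
word = pack_instruction(0b01, 0b000011, 0b0001, 0b1111)
print(format(word, "016b"))  # 0100001100011111
```

The widths add up to 2 + 6 + 4 + 4 = 16 bits, which is why the instruction fits in exactly two bytes.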
Machine Code
• Two bits for instruction type. How many
types of instructions are possible within this
format?
• Operation Code is 6 bits. How many types
of operations are possible within this format?
• The register operands are 4 bits each. How
many different registers can be indicated
with 4 bits? (similar to addressing)
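One way to check your answers: a field of n bits can hold 2 to the power n different patterns. A small Python sketch of that arithmetic (the function name `distinct_values` is ours):

```python
def distinct_values(bits):
    """Number of different patterns that `bits` binary digits can hold."""
    return 2 ** bits


fields = [("instruction type", 2), ("operation code", 6), ("register operand", 4)]
for name, width in fields:
    print(f"{name}: {width} bits -> {distinct_values(width)} possibilities")
```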
Machine Code
• This instruction format is a Register-Register
instruction.
• That means that it takes its inputs from two
register operands.
• The operation is performed on those two
data elements, and the result goes back
into the register specified by the first register
operand.
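The behavior of a register-register instruction can be sketched in Python. This is only an illustration: the registers are modeled as a plain list, the starting values are made up, and overflow is ignored because the notes do not give a register width.

```python
def add_registers(registers, first, second):
    """Register-register add: the result overwrites the first operand register."""
    registers[first] = registers[first] + registers[second]


registers = [0] * 16   # a 4-bit operand field can name 16 registers
registers[2] = 7       # made-up starting values
registers[4] = 5

add_registers(registers, 2, 4)
print(registers[2], registers[4])  # 12 5
```

Note that the second register keeps its value; only the first register operand changes.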
Machine Code Instruction
• Machine code is not hard, just painful
and slow to work with.
– Register-Register instruction format is ‘00’
– Op Code to add two registers is ‘010000’
– To add the contents of register 2, specify ‘0010’
– To add the contents of register 4, specify ‘0100’
• Complete instruction in 0s and 1s:
• 00 010000 0010 0100
Do you remember
where the result of
the addition is
stored?
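To see how the complete instruction breaks back down into its fields, here is a minimal Python sketch. It assumes the field order and widths given earlier (2, 6, 4, and 4 bits); the function name `decode_instruction` is ours.

```python
def decode_instruction(bits):
    """Split a 16-bit instruction string into its four fields."""
    bits = bits.replace(" ", "")
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 16 binary digits")
    return {
        "type": bits[0:2],            # instruction type
        "opcode": bits[2:8],          # operation code
        "reg1": int(bits[8:12], 2),   # first register operand
        "reg2": int(bits[12:16], 2),  # second register operand
    }


fields = decode_instruction("00 010000 0010 0100")
print(fields)
# In a register-register instruction the result goes to the first operand.
print(f"result is stored in register {fields['reg1']}")
```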
Assembly Language
• Working with 0s and 1s is hard – and humans are
prone to making errors.
• Languages have been created to make
programming easier.
• Assembly language is the lowest level language.
– Uses mnemonics and abbreviations.
• Our add-two-registers instruction:
– 00 010000 0010 0100
• Can be represented (1 to 1) with an assembly
instruction:
– ADR R2 R4
– ADd Registers R2 and R4, result in R2
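The 1-to-1 translation an assembler performs can be sketched in a few lines of Python. This is a toy, not a real assembler: the `OPCODES` table is hypothetical and holds only the one instruction used in these notes.

```python
# Hypothetical op-code table holding the one instruction from these notes:
# mnemonic -> (instruction-type bits, op-code bits)
OPCODES = {"ADR": ("00", "010000")}


def register_bits(name):
    """Turn a register name such as 'R2' into its 4-bit field, '0010'."""
    if not name.startswith("R"):
        raise ValueError(f"not a register name: {name}")
    number = int(name[1:])
    if not 0 <= number <= 15:
        raise ValueError(f"register out of range: {name}")
    return format(number, "04b")


def assemble(line):
    """Translate one assembly instruction into one machine instruction."""
    mnemonic, first, second = line.split()
    instr_type, opcode = OPCODES[mnemonic]
    return " ".join([instr_type, opcode, register_bits(first), register_bits(second)])


print(assemble("ADR R2 R4"))  # 00 010000 0010 0100
```

Each assembly line produces exactly one machine instruction, which is what "1 to 1" means here.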
High-Level Languages
• Assembly language is a big
improvement over machine code.
• Assembly is translated by an assembler
program to 0s and 1s that the computer
can work with.
• More powerful (and human-readable)
languages have been created (which
must also be translated to 0s and 1s).
• These are called high-level languages.
• Basic, Fortran, C, C++, C#, R, etc.
High-Level Languages
• Our add-two-registers instruction:
– 00 010000 0010 0100
• In assembly language:
– ADR R2 R4
– ADd Registers R2 and R4, result in R2
• In a high-level language, it might look like:
– Number1 = Number1 + Number2
– Better?
Many-to-1 translation
• ADR R2 R4
• The high-level language version might look like:
– Sum = Number1 + Number2
• But this high-level language has another type of
translation embedded: memory addressing
– Number1, Number2, and Sum are data values
stored in memory, not registers.
– The values for Number1 and Number2 must
first be loaded from memory into registers.
– Then the add operation can be performed.
– Then the result is stored back to memory in Sum.
• Additional machine-level instructions are needed to
carry out this one high-level language instruction.
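The expansion of one high-level statement into several low-level steps can be sketched in Python. The `LOAD` and `STORE` mnemonics are hypothetical (the notes define only `ADR`), and the choice of registers R2 and R4 simply reuses the earlier example.

```python
def translate_sum(target, left, right):
    """Expand `target = left + right` into assembly-like steps."""
    return [
        f"LOAD R2 {left}",     # memory -> register (hypothetical mnemonic)
        f"LOAD R4 {right}",    # memory -> register (hypothetical mnemonic)
        "ADR R2 R4",           # add registers, result in R2
        f"STORE R2 {target}",  # register -> memory (hypothetical mnemonic)
    ]


def run(program, memory):
    """Execute the assembly-like steps against a dictionary used as memory."""
    registers = {}
    for line in program:
        op, a, b = line.split()
        if op == "LOAD":
            registers[a] = memory[b]
        elif op == "ADR":
            registers[a] = registers[a] + registers[b]
        elif op == "STORE":
            memory[b] = registers[a]
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


memory = {"Number1": 3, "Number2": 4, "Sum": 0}  # made-up values
program = translate_sum("Sum", "Number1", "Number2")
run(program, memory)
print(len(program), memory["Sum"])  # 4 7
```

One high-level statement became four low-level instructions in this sketch.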
High-level Language Translation
• High-level language instructions must be
translated/converted to machine code before
the computer can run them.
• This process requires a translation program:
– Compiler
– Interpreter
– (Assembler was used for assembly language)
• Languages like C, C++, COBOL, Fortran and
Pascal are typically compiled languages.
Compiler
• Compiler takes the high-level language
program (as text) as its input.
• It produces the machine code version of
the program as its output.
• It does not change the high-level program;
the machine code program is a new
file.
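The compiler's file-in, file-out behavior can be sketched with a toy in Python. It handles only statements of the form `Target = A + B`, and it writes assembly-like text rather than real machine code; the `LOAD` and `STORE` mnemonics and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def compile_statement(statement):
    """Translate one `Target = A + B` statement into assembly-like lines."""
    target, expression = [part.strip() for part in statement.split("=")]
    left, right = [part.strip() for part in expression.split("+")]
    return [f"LOAD R2 {left}", f"LOAD R4 {right}", "ADR R2 R4", f"STORE R2 {target}"]


def compile_file(source_path, output_path):
    """Translate the whole program at once and write it to a new file."""
    translated = []
    for statement in Path(source_path).read_text().splitlines():
        if statement.strip():
            translated.extend(compile_statement(statement))
    Path(output_path).write_text("\n".join(translated) + "\n")


with tempfile.TemporaryDirectory() as folder:
    source_path = Path(folder) / "program.txt"
    output_path = Path(folder) / "program.out"
    source_before = "Sum = Number1 + Number2\nTotal = Sum + Number1\n"
    source_path.write_text(source_before)

    compile_file(source_path, output_path)

    source_after = source_path.read_text()
    compiled_lines = output_path.read_text().splitlines()

print(source_after == source_before)  # True: the source file is unchanged
print(len(compiled_lines))            # 8
```

The translation happens once; the output file can then be run as often as needed without the compiler.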
Interpreter
• Some languages, like BASIC and Visual Basic, are
interpreted languages, not compiled.
• The interpreter does not convert the entire program
all at once.
• Instead, it converts instructions one at a time,
and has the computer execute each instruction.
• Slower, because every time the program is run, it must
be interpreted again.
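The one-instruction-at-a-time approach can be sketched as a toy interpreter in Python. It is an illustration only: it understands just statements of the form `Target = A + B`, and the variable names and values are made up.

```python
def interpret(source_text, variables):
    """Translate and execute one statement at a time."""
    for statement in source_text.splitlines():
        if not statement.strip():
            continue
        # Translate this one statement...
        target, expression = [part.strip() for part in statement.split("=")]
        left, right = [part.strip() for part in expression.split("+")]
        # ...and execute it immediately, before looking at the next one.
        variables[target] = variables[left] + variables[right]
    return variables


program = "Sum = Number1 + Number2\nTotal = Sum + Number1\n"
variables = interpret(program, {"Number1": 3, "Number2": 4})
print(variables["Sum"], variables["Total"])  # 7 10
```

Nothing translated is saved, so running the program again repeats all of the translation work.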
Virtual Machine
• A third and more recent way to
translate high-level programs is with
a Virtual Machine (or byte-code
interpreter). Java is an example.
• Separates translation into two steps.
– Convert the program to “byte-code”
– The “byte-code” is then interpreted by
a virtual machine.
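The two steps can be tried directly in Python, which is used here as a stand-in for Java because it also converts programs to byte-code and runs them on a virtual machine. The built-ins `compile` and `exec` and the standard `dis` module are real; the variable names and values are made up.

```python
import dis

source = "total = number1 + number2"

# Step 1: convert the program text to byte-code.
code = compile(source, "<example>", "exec")
print(type(code.co_code))  # <class 'bytes'>

# Optional: list the byte-code instructions in readable form.
# The exact listing differs between Python versions.
dis.dis(code)

# Step 2: the Python virtual machine interprets the byte-code.
namespace = {"number1": 3, "number2": 4}
exec(code, namespace)
print(namespace["total"])  # 7
```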
Virtual Machine
• The virtual machine/byte-code interpreter
makes programs transportable and device-independent.
• Converted byte-code can move over the
internet.
Virtual Machine
• Each different processor/machine needs its
own virtual machine, which will be different
from CPU to CPU.
• Different because of different machine
codes and operating systems.
“R” is
• A structured programming language (R also has
object systems such as S3 and S4, but no agents)
• With extensions for Big Data – functions and
techniques for manipulating large data sets
using parallel processing.
• An interpreted language. R is an implementation of
the “S” language, and the “R” interpreter itself is
written mostly in C.
• The interpreter's source code is compiled, using a
compiler for the platform.
End of Lecture