September 2019


Database Languages

A database system provides languages for specifying the database schema and for manipulating the data in the database. Database languages fall into two main categories:
  • Data Definition Language (DDL)
  • Data Manipulation Language (DML)



Data Definition Language (DDL)

Data definition language is the specification notation for defining the database schema.
  • Used by the Database Administrator (DBA) and database designers to specify the conceptual schema of a database.
  • In many DBMSs, the DDL is also used to define the internal and external schemas (views).
  • In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define the internal and external schemas.
  • SDL is typically realized via DBMS commands provided to the DBA and database designers.
  • Example: CREATE TABLE account (account_number CHAR(10), balance INTEGER)

  • Execution of the above DDL statement creates the account table.
  • It updates a special set of tables called the data dictionary.

Data dictionary: The DDL compiler generates a set of tables that are stored in the data dictionary. Simply put, the data dictionary is a special set of tables that contains information about tables. The data dictionary contains metadata (i.e., data about data).

Metadata: Data that describes the database or one of its parts is called metadata. The schema of a table is an example of metadata.

Data storage and definition language: a special type of DDL that is used to specify the storage structure and access methods used by the database system.

The DDL provides facilities to define:
  • The database schema
  • Database tables
  • Integrity constraints
      • Domain constraints
      • Referential integrity (references constraint in SQL)
      • Assertions
      • Triggers
      • Views
  • Security and authorization
  • Modifications to the schema

The common DDL commands are CREATE, ALTER, and DROP.
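As a minimal sketch of how a DDL statement updates the data dictionary, the following Python snippet runs CREATE TABLE against an in-memory SQLite database and then reads the system catalog. SQLite, its `sqlite_master` catalog table, and `PRAGMA table_info` are assumptions chosen for illustration; other DBMSs expose their data dictionary under different names. The column is written `account_number` because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# DDL: define the schema of the account table.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# SQLite's catalog table plays the role of the data dictionary here:
# it stores metadata (data about data), not account rows.
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'account'"
).fetchall()

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

print(catalog)
print(columns)
```

The table itself holds no rows at this point; only the catalog has changed.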

Data Manipulation Language (DML)

A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model. A DML is also known as a query language. There are basically two classes of DML:

  • Procedural (or low-level) DMLs: the user specifies what data are required and how to get those data.
  • Declarative (or nonprocedural or high-level) DMLs: the user specifies what data are needed without specifying how to get those data.

Data manipulation is:
  • The retrieval of information stored in the database
  • The insertion of new information into the database
  • The deletion of information from the database
  • The modification of information stored in the database

Query: A query is a statement requesting the retrieval of information. SQL is the most widely used query language; SELECT, INSERT, UPDATE, and DELETE are its common DML statements.
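The four kinds of data manipulation can be sketched with Python's built-in sqlite3 module. SQLite and the sample account numbers and balances are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# Insertion of new information.
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("INSERT INTO account VALUES ('A-102', 900)")

# Modification of stored information.
conn.execute(
    "UPDATE account SET balance = balance + 100 WHERE account_number = 'A-101'"
)

# Deletion of information.
conn.execute("DELETE FROM account WHERE account_number = 'A-102'")

# Retrieval (a query). The statement is declarative: it says what is
# wanted, not how the DBMS should find it.
rows = conn.execute("SELECT account_number, balance FROM account").fetchall()
print(rows)
```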



View of Data

The database system provides users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.


Data Abstraction

Database designers use complex data structures to represent the data in the database, and developers hide this complexity from users through several levels of abstraction: the physical level, the logical level, and the view level. This process is called data abstraction.

Levels of Data Abstraction

The three levels of data abstraction can be shown as follows
 
(Fig: Different levels of data abstraction)


Physical level

  • It is the lowest level of abstraction and describes how the data are actually stored.
  • The physical level describes complex low-level data structures in detail.
  • At this level, a record such as customer or account can be described as a block of consecutive storage locations (e.g., bytes or words).
  • The database system hides many of the lowest-level storage details from database programmers. Database administrators may be aware of certain details of the physical organization of the data.


Logical level

  • It is the next higher level of data abstraction. It describes what data are stored in the database and what relationships exist among those data.
  • At the logical level, each record is described by a type definition.
  • Programmers and database administrators work at this level of abstraction.


View level

  • It is the highest level of abstraction. It describes only a part of the database and hides some information from the user.
  • At the view level, computer users see a set of application programs that hide details of data types. Several views of the database are defined, and a database user sees only these views.
  • Views also provide a security mechanism that prevents users from accessing certain parts of the database; for example, a view can hide an employee's salary for security purposes.
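A minimal sketch of a view that hides a salary column, using Python's sqlite3 module. The `employee` table, the `employee_public` view, and the sample row are hypothetical names and data introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record type, including the sensitive column.
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000)")

# View level: expose only the columns this group of users may see.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

SQLite has no per-user permissions, so this only shows the hiding itself. In a multi-user DBMS, access would typically be granted on the view and not on the underlying table.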


Instances and Schema

Instance (Database State)

The collection of information stored in the database at a particular moment is called an instance of the database. It is the actual content of the database at a particular point in time.
  • It is analogous to the value of a variable in a program.
  • It includes all of the data in the database at that moment.
  • It is also called a database instance (or occurrence or snapshot).
  • The term instance is also applied to individual database components, e.g., record instance, table instance, entity instance.

Initial Database State

  • Refers to the database state when it is initially loaded into the system.

Valid State

  • A state that satisfies the structure and constraints of the database.

Schema 

The overall design of the database is called the database schema. Simply, the database schema is the logical structure of the database.
  • The concepts of database schema and instance can be understood by analogy to a program written in a programming language.
  • A database schema corresponds to the variable declarations in a program, and the values of the variables at a point in time correspond to an instance of the database.
  • Example: The database consists of information about a set of customers and accounts and the relationship between them.
  • A database system has several schemas, partitioned according to the level of abstraction, such as the physical schema and the logical schema.

(Fig: Schema diagram for the database)



Database State Vs. Schema
  • The database schema changes very infrequently. 
  • The database state changes every time the database is updated. 
  • Schema is also called intension.
  • State is also called extension.
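The difference between schema and state can be sketched by updating a database and comparing both before and after. SQLite, the `customer` table, and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")

def schema(conn):
    # The stored CREATE statements describe the schema.
    query = "SELECT sql FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in conn.execute(query)]

def state(conn):
    # The rows present right now make up the database state.
    return conn.execute("SELECT name, city FROM customer ORDER BY name").fetchall()

schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO customer VALUES ('Hari', 'Pokhara')")
schema_after, state_after = schema(conn), state(conn)

print(schema_before == schema_after)   # the schema did not change
print(state_before, state_after)       # the state did
```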


Physical and Logical Schema
  • Physical schema: The physical schema describes the database design at the physical level. It is hidden beneath the logical schema and can usually be changed easily without affecting application programs.
  • Logical schema: The logical schema describes the database design at the logical level. Programmers construct applications using the logical schema.
  • Subschema: The database system may also have several schemas at the view level. These are called subschemas, and they describe different views of the database.



Three-Schema Architecture

The goal of the three-schema architecture is to separate the user applications from the physical database.
It is not explicitly used in commercial DBMS products, but it has been useful in explaining database system organization.

Defines DBMS schemas at three levels:

Internal schema at the internal level to describe physical storage structures and access paths (e.g., indexes).
  • Typically uses a physical data model.

Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. 
  • Uses a conceptual or an implementation data model.

External schemas at the external level (or view level) to describe the various user views. 
  • Usually uses the same data model as the conceptual schema.

(Fig: The three-schema architecture)


Mappings among schema levels are needed to transform requests and data. 
  • Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.
  • Data extracted from the internal DBMS level is reformatted to match the user's external view (e.g., formatting the results of an SQL query for display in a Web page).



Computer Network Topology and its Types | Computer Network



Network topology is the layout pattern of interconnections of the various elements (links, nodes, etc.) of a computer network. Network topologies may be physical or logical. Physical topology means the physical design of a network including the devices, location and cable installation. Logical topology refers to how data is actually transferred in a network as opposed to its physical design. In general, physical topology relates to a core network whereas logical topology relates to basic network.

A local area network (LAN) is one example of a network that exhibits both a physical topology and a logical topology. Any given node in the LAN has one or more links to one or more nodes in the network and the mapping of these links and nodes in a graph results in a geometric shape that may be used to describe the physical topology of the network. Likewise, the mapping of the data flow between the nodes in the network determines the logical topology of the network. The physical and logical topologies may or may not be identical in any particular network.


Basic topology types:


The study of network topology recognizes seven basic topologies: 
  • Point-to-point topology
  • Bus (point-to-multipoint) topology
  • Star topology
  • Ring topology
  • Tree topology
  • Mesh topology
  • Hybrid topology

This classification is based on the interconnection between computers, whether physical or logical. The physical topology of a network is determined by the capabilities of the network access devices and media, the level of control or fault tolerance desired, and the cost associated with cabling or telecommunications circuits.

Point-to-point

(Fig: Point-to-point topology)


The simplest topology is a permanent link between two endpoints. Switched point-to-point topologies are the basic model of conventional telephony. The value of a permanent point-to-point network lies in guaranteed, or nearly guaranteed, communication between the two endpoints. The value of an on-demand point-to-point connection is proportional to the number of potential pairs of subscribers, and has been expressed as Metcalfe's Law.

Types of Point-to-Point topology:

a  Permanent (Dedicated)

The easiest variation of point-to-point topology to understand is a point-to-point communications channel that appears, to the user, to be permanently associated with the two endpoints. A children's "tin-can telephone" is one example; a microphone wired to a single public address speaker is another. These are examples of physical dedicated channels.
Within many switched telecommunications systems, it is possible to establish a permanent circuit. One example might be a telephone in the lobby of a public building, which is programmed to ring only the number of a telephone dispatcher. "Nailing down" a switched connection saves the cost of running a physical circuit between the two points. The resources in such a connection can be released when no longer needed, for example, a television circuit from a parade route back to the studio.

b  Switched


Using circuit-switching or packet-switching technologies, a point-to-point circuit can be set up dynamically, and dropped when no longer needed. This is the basic mode of conventional telephony.


Bus topology

(Fig: Bus topology)

In local area networks where bus topology is used, each node is connected to a single cable. Each computer or server is connected to the single bus cable through some kind of connector. A terminator is required at each end of the bus cable to prevent the signal from bouncing back and forth on the bus cable. A signal from the source travels in both directions to all machines connected on the bus cable until it finds the MAC address or IP address on the network that is the intended recipient. If the machine address does not match the intended address for the data, the machine ignores the data. Alternatively, if the data does match the machine address, the data is accepted. Since the bus topology consists of only one wire, it is rather inexpensive to implement when compared to other topologies. However, the low cost of implementing the technology is offset by the high cost of managing the network. Additionally, since only one cable is utilized, it can be the single point of failure. If the network cable breaks, the entire network will be down.

Types of Bus network topology:


a  Linear bus

The type of network topology in which all of the nodes of the network are connected to a common transmission medium which has exactly two endpoints (this is the 'bus', which is also commonly referred to as the backbone, or trunk) – all data that is transmitted between nodes in the network is transmitted over this common transmission medium and is able to be received by all nodes in the network virtually simultaneously (disregarding propagation delays).

The two endpoints of the common transmission medium are normally terminated with a device called a terminator that exhibits the characteristic impedance of the transmission medium and which dissipates or absorbs the energy that remains in the signal to prevent the signal from being reflected or propagated back onto the transmission medium in the opposite direction, which would cause interference with and degradation of the signals on the transmission medium.

b  Distributed bus

The type of network topology in which all of the nodes of the network are connected to a common transmission medium which has more than two endpoints that are created by adding branches to the main section of the transmission medium – the physical distributed bus topology functions in exactly the same fashion as the physical linear bus topology (i.e., all nodes share a common transmission medium).

  1. All of the endpoints of the common transmission medium are normally terminated with a device called a 'terminator'.
  2. The physical linear bus topology is sometimes considered to be a special case of the physical distributed bus topology – i.e., a distributed bus with no branching segments.
  3. The physical distributed bus topology is sometimes incorrectly referred to as a physical tree topology – however, although the physical distributed bus topology resembles the physical tree topology, it differs from the physical tree topology in that there is no central node to which any other nodes are connected, since this hierarchical functionality is replaced by the common bus.


Star topology

(Fig: Star topology)

In local area networks with a star topology, each network host is connected to a central hub. In contrast to the bus topology, the star topology connects each node to the hub with a point-to-point connection. All traffic that traverses the network passes through the central hub. The hub acts as a signal booster or repeater. The star topology is considered the easiest topology to design and implement. An advantage of the star topology is the simplicity of adding additional nodes. The primary disadvantage of the star topology is that the hub represents a single point of failure.

  • A point-to-point link is sometimes categorized as a special instance of the physical star topology – therefore, the simplest type of network that is based upon the physical star topology would consist of one node with a single point-to-point link to a second node, the choice of which node is the 'hub' and which node is the 'spoke' being arbitrary.
  • After the special case of the point-to-point link, the next simplest type of network that is based upon the physical star topology would consist of one central node – the 'hub' – with two separate point-to-point links to two peripheral nodes – the 'spokes'.
  • Although most networks that are based upon the physical star topology are commonly implemented using a special device such as a hub or switch as the central node (i.e., the 'hub' of the star), it is also possible to implement a network that is based upon the physical star topology using a computer or even a simple common connection point as the 'hub' or central node.
  • Star networks may also be described as either broadcast multi-access (BMA) or non-broadcast multi-access (NBMA), depending on whether the technology of the network either automatically propagates a signal at the hub to all spokes, or only addresses individual spokes with each communication.

a Extended star 

A type of network topology in which a network that is based upon the physical star topology has one or more repeaters between the central node (the 'hub' of the star) and the peripheral or 'spoke' nodes, the repeaters being used to extend the maximum transmission distance of the point-to-point links between the central node and the peripheral nodes beyond that which is supported by the transmitter power of the central node or beyond that which is supported by the standard upon which the physical layer of the physical star network is based.
If the repeaters in a network that is based upon the physical extended star topology are replaced with hubs or switches, then a hybrid network topology is created that is referred to as a physical hierarchical star topology, although some texts make no distinction between the two topologies.

b Distributed Star 

A type of network topology that is composed of individual networks based upon the physical star topology, connected together in a linear fashion (i.e., 'daisy-chained') with no central or top-level connection point (e.g., two or more 'stacked' hubs, along with their associated star-connected nodes or 'spokes').

Ring topology

(Fig: Ring topology)

A network topology that is set up in a circular fashion, in which data travels around the ring in one direction and each device on the ring acts as a repeater to keep the signal strong as it travels. Each device incorporates a receiver for the incoming signal and a transmitter to send the data on to the next device in the ring. The network is dependent on the ability of the signal to travel around the ring.

Types of Ring network topology:


a  Token Ring

Token ring is a local area network (LAN) protocol that resides at the data link layer (DLL) of the OSI model. It uses a special three-byte frame called a token that travels around the ring. Possession of the token grants the possessor permission to transmit on the medium. Token ring frames travel completely around the loop.

b  Dual Ring (failover)

This structure consists of two rings: the primary ring is used for data transfer, and the secondary ring provides reliability and robustness.

Mesh Topology

(Fig: Mesh topology)

The value of a fully meshed network grows exponentially with the number of subscribers, assuming that communicating groups of any size, from two endpoints up to and including all the endpoints, can form. This is approximated by Reed's Law.

The number of connections in a full mesh = n(n - 1) / 2
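The formula above can be checked with a short Python sketch. The function name `full_mesh_links` is introduced here for illustration only.

```python
def full_mesh_links(n: int) -> int:
    """Number of point-to-point links needed to connect n nodes pairwise."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each of the n nodes links to the other n - 1 nodes; dividing by 2
    # avoids counting each link once from each end.
    return n * (n - 1) // 2

for n in (2, 5, 10, 50):
    print(n, full_mesh_links(n))
```

The link count grows roughly with the square of the number of nodes, which is why a fully connected mesh quickly becomes costly.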

a  Fully connected mesh

The physical fully connected mesh topology is generally too costly and complex for practical networks, although the topology is used when there are only a small number of nodes to be interconnected.

b  Partially connected mesh

The type of network topology in which some of the nodes of the network are connected to more than one other node in the network with a point-to-point link – this makes it possible to take advantage of some of the redundancy that is provided by a physical fully connected mesh topology without the expense and complexity required for a connection between every node in the network.


In most practical networks that are based upon the physical partially connected mesh topology, all of the data that is transmitted between nodes in the network takes the shortest path (or an approximation of the shortest path) between nodes, except in the case of a failure or break in one of the links, in which case the data takes an alternative path to the destination. This requires that the nodes of the network possess some type of logical 'routing' algorithm to determine the correct path to use at any particular time.


Tree topology

(Fig: Tree topology)

Tree topology is also known as a hierarchy network.

The type of network topology in which a central 'root' node (the top level of the hierarchy) is connected by point-to-point links to one or more nodes that are one level lower in the hierarchy (i.e., the second level). Each second-level node is in turn connected by point-to-point links to one or more nodes that are one level lower (i.e., the third level). The root is the only node that has no other node above it in the hierarchy. Each node in the network has a specific, fixed number of nodes connected to it at the next lower level; this number is referred to as the 'branching factor' of the hierarchical tree. This tree has individual peripheral nodes.

  • A network that is based upon the physical hierarchical topology must have at least three levels in the hierarchy of the tree, since a network with a central 'root' node and only one hierarchical level below it would exhibit the physical topology of a star.
  • A network that is based upon the physical hierarchical topology and with a branching factor of 1 would be classified as a physical linear topology.
  • The branching factor, f, is independent of the total number of nodes in the network and, therefore, if the nodes in the network require ports for connection to other nodes the total number of ports per node may be kept low even though the total number of nodes is large – this makes the effect of the cost of adding ports to each node totally dependent upon the branching factor and may therefore be kept as low as required without any effect upon the total number of nodes that are possible.
  • The total number of point-to-point links in a network that is based upon the physical hierarchical topology will be one less than the total number of nodes in the network.
  • If the nodes in a network that is based upon the physical hierarchical topology are required to perform any processing upon the data that is transmitted between nodes in the network, the nodes that are at higher levels in the hierarchy will be required to perform more processing operations on behalf of other nodes than the nodes that are lower in the hierarchy. Such a type of network topology is very useful and highly recommended.
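The relationships among branching factor, node count, and link count can be sketched as follows. The sketch assumes a full tree, in which every node above the lowest level has exactly `branching_factor` children; the function names are introduced here for illustration.

```python
def tree_nodes(branching_factor: int, levels: int) -> int:
    """Total nodes in a full tree with the given number of levels (root = level 1)."""
    # Level 1 has 1 node, level 2 has f nodes, level 3 has f*f nodes, and so on.
    return sum(branching_factor ** depth for depth in range(levels))

def tree_links(branching_factor: int, levels: int) -> int:
    """Every node except the root has exactly one link to its parent."""
    return tree_nodes(branching_factor, levels) - 1

def max_ports_per_node(branching_factor: int) -> int:
    """One uplink plus one port per child; independent of the total node count."""
    return branching_factor + 1

for f, levels in ((2, 3), (3, 3), (1, 4)):
    print(f, levels, tree_nodes(f, levels), tree_links(f, levels), max_ports_per_node(f))
```

With a branching factor of 1 the tree degenerates into a chain, matching the linear topology mentioned above.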




Computer Network | Ethernet (802.3)

Ethernet is the most widely installed local area network (LAN) technology. Ethernet was originally developed by Xerox from an earlier specification called ALOHAnet and then developed further by Xerox, DEC, and Intel. An Ethernet LAN typically uses coaxial cable or special grades of twisted-pair wires. Ethernet is also used in wireless. Ethernet is standardized as IEEE 802.3, which specifies a CSMA/CD bus network. CSMA/CD (Carrier Sense Multiple Access / Collision Detection) is used to detect collisions in the network. A CSMA/CD Ethernet can be implemented using a bus or even a star topology.



Common Ethernet types:

Common Name      | Speed     | Alternative Name         | Name of IEEE Standard | Cable Type, Maximum Length
-----------------|-----------|--------------------------|-----------------------|------------------------------
Ethernet         | 10 Mbps   | 10BASE-T                 | IEEE 802.3            | Copper, 100 m
Fast Ethernet    | 100 Mbps  | 100BASE-TX               | IEEE 802.3u           | Copper, 100 m
Gigabit Ethernet | 1000 Mbps | 1000BASE-LX, 1000BASE-SX | IEEE 802.3z           | Fiber, 550 m (SX), 5 km (LX)
Gigabit Ethernet | 1000 Mbps | 1000BASE-T               | IEEE 802.3ab          | Copper, 100 m


802.3 Ethernet frame format:

(Fig: 802.3 Ethernet frame format)

Explanation :

Preamble: A 7-byte pattern of alternating binary 1s and 0s used to establish synchronization. The last bit of the preamble is always 0.

Start Frame Delimiter: An 8-bit pattern indicating the formal start of the frame.

Destination Address: An address specifying a specific destination station, a group of stations, or all stations in the LAN. This address can be 16 bits or 48 bits in length, but all stations in the LAN must adhere to one format or the other. 

Source Address: The address of the originating station. This address has the same length requirements as the destination address.

Length: The length, measured in bytes, of the actual data, including the 802.2 header. This is a 16-bit field.

Following the 802.3 header are the 802.2 header and the actual data. At the end of the data is the 802.3 trailer, which includes:

Padding: Extra, non-data bytes can be inserted into the frame so that it reaches the minimum frame length the physical network requires (64 bytes in IEEE 802.3).

Frame check sequence: At the end of the frame is a 32-bit Cyclic Redundancy Check (CRC) computed over the frame contents, starting with the destination address and ending at the end of the data and any padding.
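A minimal sketch of assembling the fields described above, assuming 48-bit addresses and treating the 802.2 header as part of `data`. The function name `build_frame`, the sample addresses, and the byte order of the check sequence are assumptions of this sketch. The CRC here is computed over the address, length, data, and padding fields. Python's `zlib.crc32` uses the same CRC-32 polynomial as Ethernet. The preamble and start frame delimiter are left out because they are normally generated by the network hardware.

```python
import struct
import zlib

# Data plus padding must reach 46 bytes so that the whole frame
# (6 + 6 + 2 + 46 + 4) is at least 64 bytes long.
MIN_DATA_LEN = 46

def build_frame(dst: bytes, src: bytes, data: bytes) -> bytes:
    """Assemble destination, source, length, data, padding, and check sequence."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("this sketch supports 48-bit addresses only")
    length = struct.pack("!H", len(data))             # 16-bit length field
    padding = b"\x00" * max(0, MIN_DATA_LEN - len(data))
    body = dst + src + length + data + padding
    fcs = struct.pack("<I", zlib.crc32(body))         # 32-bit CRC over the body
    return body + fcs

frame = build_frame(
    bytes.fromhex("ffffffffffff"),   # hypothetical destination: all stations
    bytes.fromhex("020000000001"),   # hypothetical source address
    b"hello",
)
print(len(frame))
```

A receiver can recompute the CRC over everything except the last four bytes and compare it with the check sequence to detect transmission errors.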


CSMA/CD (Carrier Sense Multiple Access / Collision Detection)

  • CSMA/CD is the protocol used in Ethernet networks to ensure that only one network node is transmitting on the network wire at any one time.
  • Carrier Sense means that every Ethernet device listens to the Ethernet wire before it attempts to transmit. If the Ethernet device senses that another device is transmitting, it will wait to transmit.
  • Multiple Access means that more than one Ethernet device can be sensing (listening and waiting to transmit) at a time.
  • Collision Detection means that when multiple Ethernet devices accidentally transmit at the same time, they are able to detect this error.
(Fig: CSMA/CD (Carrier Sense Multiple Access / Collision Detection))
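The three parts of CSMA/CD can be sketched as a toy model of a single time slot on a shared wire. This is a simplification under stated assumptions: time is divided into slots, all ready stations sense the wire at the same instant, and the retry behaviour after a collision is not modelled. The names `SlotResult` and `run_slot` are introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class SlotResult:
    outcome: str        # "idle", "deferred", "success", or "collision"
    transmitters: list  # stations that put a signal on the wire in this slot

def run_slot(medium_busy: bool, ready_stations: set) -> SlotResult:
    """Decide what happens in one time slot on a shared Ethernet segment."""
    if not ready_stations:
        return SlotResult("idle", [])
    # Carrier sense: if another device is already transmitting, everyone waits.
    if medium_busy:
        return SlotResult("deferred", [])
    # Multiple access: every ready station saw an idle wire and starts sending.
    senders = sorted(ready_stations)
    if len(senders) == 1:
        return SlotResult("success", senders)
    # Collision detection: more than one signal is on the wire at once.
    return SlotResult("collision", senders)

print(run_slot(False, {"A"}))
print(run_slot(True, {"A"}))
print(run_slot(False, {"A", "B"}))
```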

Advantages of Ethernet

  • Short access delay at low load.
  • MAC management is relatively simple, distributed.
  • Huge installed base and significant operational experience.


Disadvantages of Ethernet

  • Operation at high traffic load is problematic.
  • Variable/unbounded delay; not well suited for real-time applications.

Why CISC?

  • Compiler simplification?
      • Disputed
      • Complex machine instructions are harder to exploit
      • Optimization is more difficult
  • Smaller programs?
      • A program takes up less memory, but memory is now cheap
      • A program may not occupy fewer bits; it may just look shorter in symbolic form
          • More instructions require longer op-codes
          • Register references require fewer bits
  • Faster programs?
      • Bias towards use of simpler instructions
      • More complex control unit
      • Larger microprogram control store
      • Thus simple instructions take longer to execute
  • It is far from clear that CISC is the appropriate solution

CISC Characteristics

  • A large number of instructions, typically from 100 to 250
  • Some instructions that perform specialized tasks and are used infrequently
  • A large variety of addressing modes, typically from 5 to 20 different modes
  • Variable-length instruction formats
  • Instructions that manipulate operands in memory

RISC Characteristics

  • One instruction per cycle
  • Register-to-register operations
  • Few, simple addressing modes
  • Few, simple instruction formats
  • Hardwired design (no microcode)
  • Fixed-length, easily decoded instruction format
  • More compile time/effort
  • Relatively few instructions
  • Memory access limited to load and store instructions
  • Relatively large number of registers in the processor unit
  • Use of overlapped register windows to speed up procedure call and return
  • Efficient instruction pipeline
  • Compiler support for efficient translation of HLL programs into machine language programs
  • E.g.: Sun SPARC, Berkeley RISC I
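The contrast between instructions that manipulate operands in memory and a load/store design can be sketched with a toy interpreter. The instruction names, the register names, and the memory layout are all hypothetical; no real instruction set is being modelled.

```python
def run_memory_to_memory(memory: dict) -> int:
    """CISC style: one instruction reads both operands from memory
    and writes the result back to memory."""
    memory["c"] = memory["a"] + memory["b"]   # ADD c, a, b
    return 1                                  # instructions executed

def run_load_store(memory: dict) -> int:
    """RISC style: only LOAD and STORE touch memory; ADD works on registers."""
    program = [
        ("LOAD", "r1", "a"),
        ("LOAD", "r2", "b"),
        ("ADD", "r3", "r1", "r2"),
        ("STORE", "r3", "c"),
    ]
    regs = {}
    for op, *args in program:
        if op == "LOAD":
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":
            dest, left, right = args
            regs[dest] = regs[left] + regs[right]
        elif op == "STORE":
            reg, addr = args
            memory[addr] = regs[reg]
    return len(program)

cisc_mem = {"a": 2, "b": 3, "c": 0}
risc_mem = {"a": 2, "b": 3, "c": 0}
cisc_count = run_memory_to_memory(cisc_mem)
risc_count = run_load_store(risc_mem)
print(cisc_mem["c"], cisc_count, risc_mem["c"], risc_count)
```

Both versions compute the same result. The load/store version needs more instructions, but each one is simple, which makes it easier to decode and pipeline.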

RISC vs CISC

  • Not clear-cut
  • Many designs borrow from both philosophies
  • E.g., PowerPC and Pentium II

The Next Step - RISC (Reduced Instruction Set Computer)

  Key features

  • A large number of general-purpose registers, or use of compiler technology to optimize register use
  • A limited and simple instruction set
  • Emphasis on optimising the instruction pipeline

(Fig: RISC vs CISC)


Advantages of CISC

  • Microprogramming is as easy as assembly language to implement and much less expensive than hardwiring a control unit.
  • The ease of microcoding new instructions allowed designers to make CISC machines upwardly compatible: a new computer could run the same programs as earlier computers because it would contain a superset of the instructions of the earlier computers.
  • As each instruction became more capable, fewer instructions could be used to implement a given task. This made more efficient use of the relatively slow main memory.
  • Because microprogram instruction sets can be written to match the constructs of high-level languages, the compiler does not have to be as complicated.

Disadvantages of CISC

  • Earlier generations of a processor family were generally contained as a subset in every new version, so the instruction set and chip hardware became more complex with each generation of computers.
  • So that as many instructions as possible could be stored in memory with the least possible wasted space, individual instructions could be of almost any length. This means that different instructions take different amounts of clock time to execute, slowing down the overall performance of the machine.
  • Many specialized instructions are not used frequently enough to justify their existence; only about 20% of the available instructions are used in a typical program.
  • CISC instructions typically set the condition codes as a side effect of the instruction. Not only does setting the condition codes take time, but programmers have to remember to examine the condition code bits before a subsequent instruction changes them.

Advantages of RISC

  • Speed: Since a simplified instruction set allows for a pipelined, superscalar design, RISC processors often achieve 2 to 4 times the performance of CISC processors using comparable semiconductor technology and the same clock rates.
  • Simpler hardware: Because the instruction set of a RISC processor is so simple, it uses up much less chip space; extra functions, such as memory management units or floating-point arithmetic units, can also be placed on the same chip. Smaller chips allow a semiconductor manufacturer to place more parts on a single silicon wafer, which can lower the per-chip cost dramatically.
  • Short design cycle: Since RISC processors are simpler than corresponding CISC processors, they can be designed more quickly and can take advantage of other technological developments sooner than corresponding CISC designs, leading to greater leaps in performance between generations.

Disadvantages of RISC

  • Code quality: The performance of a RISC processor depends greatly on the code that it is executing. If the programmer (or compiler) does a poor job of instruction scheduling, the processor can spend quite a bit of time stalling, waiting for the result of one instruction before it can proceed with a subsequent instruction.
  • Debugging: Unfortunately, instruction scheduling can make debugging difficult. If scheduling (and other optimizations) is turned off, the machine-language instructions show a clear connection with their corresponding lines of source. Many RISC programmers debug their code in an unoptimized, unscheduled form and then turn on the scheduler and hope that the program continues to work in the same way.
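The effect of instruction scheduling on stalls can be sketched with a toy pipeline model. The model assumes that an instruction stalls for one cycle whenever it reads a register written by the instruction immediately before it; real pipelines have more detailed rules. The instruction tuples and the function `count_stalls` are hypothetical.

```python
def count_stalls(program: list) -> int:
    """Count one-cycle stalls: an instruction stalls when it reads a
    register written by the instruction immediately before it."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_dest = prev[1]       # register written by the earlier instruction
        curr_sources = curr[2:]   # operands read by the later instruction
        if prev_dest in curr_sources:
            stalls += 1
    return stalls

# Each tuple is (operation, destination register, source operands...).
unscheduled = [
    ("LOAD", "r1", "a"),
    ("ADD", "r2", "r1", "r1"),   # needs r1 straight after the load
    ("LOAD", "r3", "b"),
    ("ADD", "r4", "r3", "r3"),   # needs r3 straight after the load
]

# Same instructions, reordered so each load has time to complete.
scheduled = [
    ("LOAD", "r1", "a"),
    ("LOAD", "r3", "b"),
    ("ADD", "r2", "r1", "r1"),
    ("ADD", "r4", "r3", "r3"),
]

print(count_stalls(unscheduled), count_stalls(scheduled))
```

The scheduled order no longer matches the order in which the source was written, which is one reason scheduled code is harder to follow in a debugger.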