Uploaded by Emma sweya

Guide to Optimizing LTE Service Drops

Security Level: internal
Guide to Optimizing
LTE Service Drops
www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Huawei Confidential
Change History
Date | Version | Description | Reviewer | Author
2012.1.10 | 1.0 | Completed the draft. | |
Abstract
This document
• Defines the call drop rate.
• Describes how to use the related counters to diagnose a call drop and to analyze the factors influencing the KPI.
• Describes common diagnosis methods and the standard actions to be taken by front-line engineers to handle a call drop problem.
• Describes the deliverables that front-line engineers must submit to R&D engineers if they fail to solve the problem after taking the standard actions.
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Calculation of the Call Drop Rate on the UE Side (1/3)
• Call Drop Rate = eRAB AbnormRel / eRAB Setup Success x 100%
where eRAB AbnormRel is the number of e-RAB abnormal releases and eRAB Setup Success is the number of successful e-RAB setup events.
Calculation of the Call Drop Rate on the UE Side (2/3)
• eRAB AbnormRel is calculated by Huawei Genex PA as follows:
I. eRAB AbnormRel increments by 1 if the UE
  - Does not receive the DEACTIVATE EPS BEARER CONTEXT REQUEST message, and
  - Does not receive the DETACH REQUEST message from the MME, and
  - Does not send the DETACH REQUEST message, and
  - Receives the RRCConnectionReconfiguration message containing the IE drb-ToReleaseList.
  In this case, if the number of e-RABs (ERAB num) minus the number of eps-BearerIdentity entries contained in the release list is 0, the UE transits to RRC_Idle mode.
II. eRAB AbnormRel increments by 1 if the UE
  - Does not receive the DEACTIVATE EPS BEARER CONTEXT REQUEST message, and
  - Does not receive the DETACH REQUEST message from the MME, and
  - Does not send the DETACH REQUEST message, and
  - Receives the RRCConnectionRelease message after the RLC layer has transmitted data in either direction within the last 4 s.
  In this case, the UE directly transits to RRC_Idle mode.
Calculation of the Call Drop Rate on the UE Side (3/3)
III. ERABAbnormalRel increments by 1 for each released e-RAB if the UE has
established e-RAB(s) and enters the RRC_Idle mode before receiving the
RRCConnectionRelease message.
IV. ERABAbnormalRel increments by 1 if the UE initiates the RRC connection setup
request without receiving the RRC Connection Reconfiguration, Deactivate EPS
Bearer Context Request, Detach Request, RRC State, or RRC Connection
Release message.
V. ERABAbnormalRel increments by 1 if the event RRCReestablishFail occurs.
The timestamp contained in these two events is the same.
Note: The acceptance criteria of some customers may require that all RRC
reestablishments initiated by the UE be counted as service drops.
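Rules I and II above can be read as a small predicate. The following is a minimal sketch, not the Genex PA implementation: the `ReleaseObservation` record and its field names are hypothetical, and rules III to V are left out.

```python
from dataclasses import dataclass


@dataclass
class ReleaseObservation:
    """Hypothetical, simplified summary of what the UE log shows around one release."""
    got_deactivate_bearer_request: bool  # DEACTIVATE EPS BEARER CONTEXT REQUEST received
    got_detach_request_from_mme: bool    # DETACH REQUEST received from the MME
    sent_detach_request: bool            # DETACH REQUEST sent by the UE
    got_reconfig_with_drb_release: bool  # RRCConnectionReconfiguration with drb-ToReleaseList
    got_rrc_connection_release: bool     # RRCConnectionRelease received
    seconds_since_last_rlc_data: float   # time since the last RLC transfer, either direction


def is_abnormal_release(obs: ReleaseObservation, activity_window_s: float = 4.0) -> bool:
    """Return True if rule I or rule II would count an abnormal release."""
    # Shared precondition of rules I and II: no NAS bearer deactivation and no detach.
    if (obs.got_deactivate_bearer_request
            or obs.got_detach_request_from_mme
            or obs.sent_detach_request):
        return False
    # Rule I: the DRBs are released through RRCConnectionReconfiguration.
    if obs.got_reconfig_with_drb_release:
        return True
    # Rule II: RRCConnectionRelease arrives while RLC data was exchanged recently.
    return (obs.got_rrc_connection_release
            and obs.seconds_since_last_rlc_data <= activity_window_s)
```

The 4 s window is taken from rule II; it is exposed as a parameter only so that the sketch can be adapted if a customer's acceptance criteria use a different value.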
Calculation of the Call Drop Rate on the Network Side
• Call Drop Rate = L.E-RAB.AbnormRel / (L.E-RAB.NormRel + L.E-RAB.AbnormRel) x 100%
where L.E-RAB.AbnormRel is the number of e-RAB abnormal releases
and L.E-RAB.NormRel is the number of e-RAB normal releases.
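As a worked illustration of the UE-side and network-side formulas, the sketch below computes both rates from raw counts. The function names are hypothetical, and returning 0.0 when the denominator is zero is an assumption of this sketch, not something the formulas define.

```python
def ue_side_call_drop_rate(erab_abnorm_rel: int, erab_setup_success: int) -> float:
    """UE side: eRAB AbnormRel / eRAB Setup Success x 100%."""
    if erab_setup_success == 0:
        return 0.0  # assumption: no successful setups means nothing to report
    return 100.0 * erab_abnorm_rel / erab_setup_success


def network_side_call_drop_rate(abnorm_rel: int, norm_rel: int) -> float:
    """Network side: L.E-RAB.AbnormRel / (L.E-RAB.NormRel + L.E-RAB.AbnormRel) x 100%."""
    total = abnorm_rel + norm_rel
    if total == 0:
        return 0.0  # assumption: no releases means nothing to report
    return 100.0 * abnorm_rel / total
```

For example, 5 abnormal releases and 995 normal releases give a network-side call drop rate of 0.5%.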
Counters Recorded by the Network
• As shown in point A of Fig1, if the eNodeB sends the E-RAB RELEASE INDICATION message containing a cause value that is not "Normal Release", "User Inactivity", "cs fallback triggered", or "Inter-RAT redirection", L.E-RAB.AbnormRel increments by 1. If the E-RAB RELEASE INDICATION message requests release of multiple e-RABs, L.E-RAB.AbnormRel increments by 1 for each e-RAB.
• As shown in point A of Fig2, when the eNodeB sends the UE CONTEXT RELEASE REQUEST message to the MME, the eNodeB releases all e-RABs of the UE. If the release cause is not "Normal Release", "User Inactivity", "cs fallback triggered", or "Inter-RAT redirection", L.E-RAB.AbnormRel increments by 1 for each released e-RAB.
Note:
• The E-RAB Release procedure releases one or multiple e-RABs. After the procedure, at least the default bearer is maintained.
• The UE Context Release procedure releases all connections. No bearer is maintained after this procedure.
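The counting rule for L.E-RAB.AbnormRel can be sketched as follows. This is an illustration only: the function name is hypothetical, the cause strings are compared literally as written on the slide, and the example cause in the usage note is a placeholder.

```python
from typing import Sequence

# Cause values that are NOT counted as abnormal, as listed above.
NON_DROP_CAUSES = frozenset({
    "Normal Release",
    "User Inactivity",
    "cs fallback triggered",
    "Inter-RAT redirection",
})


def abnormal_release_increment(cause: str, erab_ids: Sequence[int]) -> int:
    """Return how much L.E-RAB.AbnormRel grows for one eNodeB-initiated release.

    erab_ids lists the e-RABs released by the message: the e-RABs named in an
    E-RAB RELEASE INDICATION, or all e-RABs of the UE for a UE CONTEXT RELEASE REQUEST.
    """
    if cause in NON_DROP_CAUSES:
        return 0
    return len(erab_ids)  # one increment per released e-RAB
```

A release with a cause such as "Radio Connection With UE Lost" that covers two e-RABs would add 2 to the counter, whereas a "User Inactivity" release adds nothing.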
Counters That Count Abnormal Releases by the Network (1/4)
• Currently, five counters count e-RAB abnormal releases by the network:
  - L.E-RAB.AbnormRel.Radio (number of e-RAB abnormal releases caused by the eNodeB)
  - L.E-RAB.AbnormRel.TNL (number of e-RAB abnormal releases caused by the transmission network)
  - L.E-RAB.AbnormRel.Cong (number of e-RAB abnormal releases caused by network congestion)
  - L.E-RAB.AbnormRel.HOFailure (number of e-RAB abnormal releases caused by handover failures)
  - L.E-RAB.AbnormRel.MME (number of e-RAB abnormal releases caused by the EPC)
Counters That Count Abnormal Releases by the Network (2/4)
• Abnormal releases caused by the EPC
  - As shown in point A of Fig1 and Fig2, if the eNodeB receives the E-RAB RELEASE COMMAND or UE CONTEXT RELEASE COMMAND message from the MME containing a cause value that is not "Normal Release", "Detach", "User Inactivity", "cs fallback triggered", or "Inter-RAT redirection", L.E-RAB.AbnormRel.MME increments by 1.
  Note: L.E-RAB.AbnormRel does not include L.E-RAB.AbnormRel.MME. A release initiated by the EPC is not counted as a call drop in eRAN2.1SPC400 and later versions.
Counters That Count Abnormal Releases by the Network (3/4)
• Abnormal release not caused by the EPC
  - As shown in point A of Fig3, if the eNodeB sends the E-RAB RELEASE INDICATION message to the MME with a cause value indicating a radio error, L.E-RAB.AbnormRel.Radio increments by 1. If the cause value indicates a transmission network error, L.E-RAB.AbnormRel.TNL increments by 1. If the cause value indicates network congestion, L.E-RAB.AbnormRel.Cong increments by 1. If the E-RAB RELEASE INDICATION message requires release of multiple e-RABs, the concerned counter increments by 1 for each e-RAB.
Counters That Count Abnormal Releases by the Network (4/4)
• Abnormal release not caused by the EPC
  - As shown in point A of Fig4, the eNodeB sends the UE CONTEXT RELEASE REQUEST message to the MME to release all e-RABs of the UE. If the cause value indicates a radio error, L.E-RAB.AbnormRel.Radio increments by 1. If the cause value indicates a transmission network error, L.E-RAB.AbnormRel.TNL increments by 1. If the cause value indicates network congestion, L.E-RAB.AbnormRel.Cong increments by 1; this counter measures the abnormal releases caused by preemption and resource congestion. If the cause value indicates a handover failure, L.E-RAB.AbnormRel.HOFailure increments by 1. The concerned counter increments by 1 for each e-RAB. The counters no longer increment when the MME sends the UE CONTEXT RELEASE COMMAND message.
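The per-cause breakdown described on these slides amounts to a lookup followed by a per-e-RAB increment. A minimal sketch follows; the category keys ("radio", "tnl", and so on) are hypothetical labels for the cause groups, since the slides do not list which individual cause values fall into each group.

```python
from collections import Counter

# Hypothetical category labels mapped to the counters named on the slides.
SUB_COUNTER_BY_CATEGORY = {
    "radio": "L.E-RAB.AbnormRel.Radio",
    "tnl": "L.E-RAB.AbnormRel.TNL",
    "congestion": "L.E-RAB.AbnormRel.Cong",
    "ho_failure": "L.E-RAB.AbnormRel.HOFailure",
}


def tally_sub_counters(releases):
    """Tally the sub-counters for eNodeB-initiated abnormal releases.

    releases is an iterable of (category, erab_count) pairs, where erab_count
    is the number of e-RABs released by that message.
    """
    counters = Counter()
    for category, erab_count in releases:
        name = SUB_COUNTER_BY_CATEGORY.get(category)
        if name is None:
            continue  # unknown category: left uncounted in this sketch
        counters[name] += erab_count  # one increment per released e-RAB
    return counters
```

Comparing the resulting distribution across cells is one way to see at a glance which cause group dominates.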
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Symptoms of a Call Drop as Observed in a Drive Test
Huawei test UE and UE Probe, or other commercial UEs and their signaling trace software, are used in a drive test. The traffic monitoring software installed on the drive test computer shows the following symptoms:
• The throughput suddenly falls to a low value or zero.
• The UE begins to receive system information when a handover is not complete or when the UE is not in a re-establishment scenario.
[Figure: drive test trace with the callouts "Low throughput" and "UE receives system information."]
Symptoms of a Call Drop as Observed from the Traffic Statistics
The call drop problem of a commercial network is observed from the traffic statistics and is reflected by the call drop rate and call drop count. The traffic statistics exported from the M2000 show the following:
• Global call drop rate, call drop count, and number of successful service setups
• Call drop rate, call drop count, and time segment of top cells
[Figure: traffic statistics with the callouts "High global call drop rate", "Top cells occupy a high percentage of call drops", and "Time segment of call drops"]
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Steps in Analyzing a Call Drop Problem (1/2)
• Step 1: Determine the scope of the call drop problem:
  - Analyze the traffic statistics and CHR to determine the scope of the call drop problem: whether it is a top-cell or top-site problem, an entire-network problem, a comprehensive problem, or a top-terminal/top-UE problem.
  Note: The analysis method varies with the scenario. If performance degrades after an upgrade, compare the counters before and after the upgrade to determine the scope of the degradation. In an inventory optimization scenario, where the call drop performance is below expectation or needs to be improved, determine the region of performance degradation.
• Step 2: Classify the causes of the call drop problem:
  - Analyze the data sources to classify the causes of the call drop problem.
Steps in Analyzing a Call Drop Problem (2/2)
• Step 3: Work through the checklist:
  - Follow the checklist to determine the root cause and the closing action.
  Note: The checklist is described in the next chapter.
• Step 4: Close the problem:
  - Close the problem and evaluate the result. If the result is unsatisfactory, repeat the preceding steps.
  - If the closing actions are reproducible, consider applying them to the entire network.
Determining the Scope of a Call Drop Problem – Principles of Selecting Top Cells (1/2)
The principles of selecting top cells vary for different scenarios.
• Scenario 1: Performance degradation in the time dimension:
  - The call drop performance degrades after an upgrade, or degrades suddenly for unknown reasons.
  Principles of selecting top cells
  For each cell, calculate the difference of the counters (call drop rate and number of e-RAB abnormal releases) before and after the upgrade. Sort the cells by the difference of the call drop rate and by the difference of the number of e-RAB abnormal releases to obtain the top cells by call drop rate degradation and the top cells by increase in e-RAB abnormal releases.
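The before/after comparison for Scenario 1 can be sketched as below. The input layout (a dictionary from cell ID to a pair of abnormal and normal release counts) and the function names are assumptions of this sketch; in practice the counts would come from the exported traffic statistics.

```python
def drop_rate(abnormal: int, normal: int) -> float:
    """Network-side call drop rate in percent; 0.0 if the cell had no releases."""
    total = abnormal + normal
    return 100.0 * abnormal / total if total else 0.0


def rank_degraded_cells(before, after, top_n=10):
    """Rank cells by degradation between two measurement periods.

    before and after map cell_id -> (abnormal_releases, normal_releases).
    Returns two lists of cell IDs: top cells by increase in call drop rate,
    and top cells by increase in the number of abnormal releases.
    """
    deltas = []
    # Only cells present in both periods can be compared.
    for cell_id in sorted(before.keys() & after.keys()):
        b_abn, b_norm = before[cell_id]
        a_abn, a_norm = after[cell_id]
        rate_delta = drop_rate(a_abn, a_norm) - drop_rate(b_abn, b_norm)
        deltas.append((cell_id, rate_delta, a_abn - b_abn))
    by_rate = sorted(deltas, key=lambda d: d[1], reverse=True)[:top_n]
    by_count = sorted(deltas, key=lambda d: d[2], reverse=True)[:top_n]
    return [d[0] for d in by_rate], [d[0] for d in by_count]
```

Keeping the two rankings separate matters because a low-traffic cell can lead the rate ranking while contributing few drops, and a busy cell can lead the count ranking with an unchanged rate.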
Determining the Scope of a Call Drop Problem – Principles of Selecting Top Cells (2/2)
• Scenario 2: Performance degradation in an inventory optimization:
  - The call drop performance of the live network is below expectation and needs to be optimized to the target value.
  Principles of selecting top cells
  Sort the cells by the difference of the call drop rate and the difference of the number of e-RAB abnormal releases to obtain the top cells of degraded call drop rate and top cells of number of e-RAB abnormal releases.
Determining the Scope of a Call Drop Problem – Criteria (1/2)
• Top-cell problem
  - After one-fifth of the top cells of high call drop rate and large number of e-RAB abnormal releases are removed from calculation of the entire-network call drop performance, if the performance is significantly improved to the expected value, the call drop problem is defined as a top-cell problem.
• Entire-network problem
  - After one-fifth of the top cells of high call drop rate and large number of e-RAB abnormal releases are removed from calculation of the entire-network call drop performance, if the performance is not significantly improved, the call drop problem is defined as an entire-network problem.
Determining the Scope of a Call Drop Problem – Criteria (2/2)
• Comprehensive problem
  - After one-fifth of the top cells of high call drop rate and large number of e-RAB abnormal releases are removed from calculation of the entire-network call drop performance, if the call drop performance is improved a little to a value slightly below the expected value, the problem is defined as a comprehensive (top-cell plus entire-network) problem.
• Top-UE problem
  - After one-fifth of the top UEs are removed from calculation of the entire-network call drop performance, if the performance is significantly improved to the expected value, the problem is defined as a top-UE problem.
Note
• Currently, the CHR of the LTE system provides no information about the terminal type. The terminal type is provided by complaining users or inferred from the symptoms.
• Due to security concerns, the eNodeB does not provide IMSI information. Therefore, top UEs can be inferred only from the TMSI, not from the IMSI.
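The cell-based criteria above can be approximated in code. This is a rough sketch under stated assumptions: the slides do not quantify "significantly improved", so `min_gain` (the minimum relative improvement that counts as significant) is a hypothetical parameter, and cells are ranked by abnormal release count only, which simplifies the combined ranking the slides describe.

```python
import math


def classify_scope(cells, target_rate, min_gain=0.1, top_fraction=0.2):
    """Classify a call drop problem as top-cell, entire-network, or comprehensive.

    cells maps cell_id -> (abnormal_releases, normal_releases).
    target_rate is the expected call drop rate in percent.
    """
    def rate(items):
        items = list(items)
        abnormal = sum(a for a, _ in items)
        total = sum(a + n for a, n in items)
        return 100.0 * abnormal / total if total else 0.0

    overall = rate(cells.values())
    # Assumption: rank by number of abnormal releases, worst first.
    ranked = sorted(cells.values(), key=lambda c: c[0], reverse=True)
    n_remove = math.ceil(len(ranked) * top_fraction)  # one-fifth of the cells
    remaining = rate(ranked[n_remove:])

    if remaining <= target_rate:
        return "top-cell"        # removing top cells reaches the expected value
    if overall - remaining < min_gain * overall:
        return "entire-network"  # removing top cells barely changes the rate
    return "comprehensive"       # some improvement, but still short of the target
```

The same structure applies to the top-UE criterion if the dictionary is keyed by TMSI instead of cell ID.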
Classifying the Causes of Call Drop Problems – Obtaining Data Source
• After determining the scope of the call drop problem, analyze the following data sources to infer the causes of the problem:
• Traffic statistics
  - Traffic statistics can be obtained from the M2000/PRS. For details, see section 2.3.3 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
• Signaling trace on the network side
  - Signaling trace can be performed on the M2000. For details, see section 2.2.2 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
• Drive test data
  - The drive test data can be obtained by performing a drive test. For details, see section 2.1.3 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
Classifying the Causes of Call Drop Problems – Acquiring Tools
The following table lists the available tools, their usage, and the acquisition method.

Tool Name: TraceViewer
Usage: Plays back signaling messages traced on the LMT.
Acquisition Method: Released together with the product version and integrated in the OfflineTool file package.

Tool Name: Probe
Usage: Installed on Huawei UE and traces signaling, scheduling, and signal quality information.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001099409&colID=ROOTENWEB|CO0000000174

Tool Name: Assistant
Usage: Installed on Huawei UE, counts and analyzes signaling, scheduling, and signal quality information.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001099389&colID=ROOTENWEB|CO0000000174

Tool Name: NIC
Usage: Batch data collection tool
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001468041&colID=ROOTENWEB|CO0000000174

Tool Name: PRS
Usage: Parses and analyzes traffic statistics of the eNodeB.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001430110&colID=ROOTWEB|CO0000000065

Tool Name: OMstar
Usage: Parses and analyzes original traffic statistics and CHR. Compares parameters.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001470066&colID=ROOTENWEB|CO0000000174
Classifying the Causes of Call Drop Problems – Interfaces of the Tracing Tools
[Figures: Signaling Trace Management interface of the M2000; Huawei UE Probe]
Classifying the Causes of Call Drop Problems – Interfaces of the Analysis Tools
[Figures: Huawei UE Probe; eNodeB TrafficReview]
Classifying the Causes of Call Drop Problems – Identifying Reconfiguration Messages
Identifying the RRC CONNECTION RECONFIGURATION message
• Start the Message Browser to view the details of the message.
  - If the message contains the IE cqiReportConfig, the message is a CQI reconfiguration message.
  - If the message contains the IE measConfig, the message is a measurement control message.
  - If the message contains the IE targetPhysCellId, the message is a handover command.
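When many traced messages have to be sorted, the three rules above can be applied programmatically. The sketch below assumes the message has already been decoded into nested dictionaries and lists keyed by IE name, which is a hypothetical representation and not the export format of any particular tool. Checking targetPhysCellId first is also an assumption of this sketch, made so that a message carrying several of these IEs is reported as a handover command.

```python
def contains_ie(node, ie_name):
    """Return True if ie_name appears as a key anywhere in the decoded message."""
    if isinstance(node, dict):
        return any(key == ie_name or contains_ie(value, ie_name)
                   for key, value in node.items())
    if isinstance(node, (list, tuple)):
        return any(contains_ie(item, ie_name) for item in node)
    return False


def classify_reconfiguration(message):
    """Classify a decoded RRC CONNECTION RECONFIGURATION message by the IEs it carries."""
    if contains_ie(message, "targetPhysCellId"):
        return "handover command"
    if contains_ie(message, "measConfig"):
        return "measurement control"
    if contains_ie(message, "cqiReportConfig"):
        return "CQI reconfiguration"
    return "other"
```

Messages that fall into "other" still need manual inspection in the Message Browser.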
Analyzing Traffic Statistics to Obtain Causes of Call Drop Problems
• Trend analysis
  - Obtain the call drop KPI of the global network for at least one to two weeks or, if an upgrade has been performed, for two weeks before and one week after the upgrade. An example is shown in the upper right figure.
• Cause analysis
  - The counters indicate whether an abnormal release is caused by the Uu interface or cell resource congestion, as shown in the lower left figure.
• Top analysis
  - Analysis of the traffic statistics can show the top cells and top time segments that have the highest RRC connection setup failure and e-RAB setup failure, as shown in the lower right figure.
Analyzing Signaling Trace to Obtain Causes of a Call Drop
• The signaling trace clearly shows the signaling procedure that causes the call drop and is effective for diagnosing problems found during a drive test or reproducible problems. The disadvantage is that the trace must be started before the problem is triggered and that manual analysis is required. The signaling trace cannot be used for irreproducible or low-probability problems.
  - Standard interface trace (a major means): Obtain the top cells and top time segments by analyzing the traffic statistics, start the standard interface trace on the top cells during the top time segments, and check which signaling procedure causes the call drop.
  - Single-UE global-network trace (a minor means): Query the IMSI of a TMSI from the EPC and start the global-network trace of this IMSI. This method is effective for ensuring VIP service.
Analyzing Drive Test Data to Obtain Causes of a Call Drop
• The advantage of a drive test is that the downlink signal strength, uplink transmit power, bit error rate, and scheduling information can be obtained, depending on the drive test software and UE capability. The disadvantage is that, in terms of signaling trace, only the signaling of the Uu interface (including the RRC and NAS messages) is traced. Therefore, it is desirable to combine a drive test with the signaling trace on the eNodeB.
• Determine whether a call drop is caused by an uplink or a downlink problem.
  - The drive test can show whether the UE or the eNodeB fails to receive the signaling message. The downlink RSRP/SINR obtained from the drive test indicates the downlink channel quality. The uplink transmit power indicates whether the uplink is insufficient.
• Determine whether a call drop is caused by the UE.
  - The UE log shows whether the UE correctly processes the received signaling messages and whether the UE suddenly stops sending data.
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Checklist for the Entire-Network Problem (1/2)
Standard Action: Preliminary analysis of traffic statistics
  Analysis Action:
    1. Analyze the traffic statistics to determine the range and cause of the call drop.
    2. Analyze the trend of the call drop rate to determine change of the call drop rate.
  Deliverables:
    1. Distribution of the causes and top causes
    2. Actions that affect the call drop rate
  Closing Action:
    1. Optimize the network according to the top causes of the call drop problem.
    2. Describe the actions that affect the call drop rate and the impact.

Standard Action: Version check
  Analysis Action:
    1. Check whether the eNodeB version is upgraded or a new patch is installed.
    2. Check whether the EPC version is upgraded or a new patch is installed.
  Deliverables: New and old version numbers
  Closing Action: Describe the changes that may affect the call drop rate based on the Release Notes.

Standard Action: Equipment and transmission alarms
  Analysis Action:
    1. Global alarm check
  Deliverables: Critical and major alarms
  Closing Action:
    1. Analyze the impact of alarms on the call drop rate.
    2. Clear the alarms and check whether the call drop KPI is restored.

Standard Action: Parameter configuration check
  Analysis Action:
    1. Global parameter configuration check
    2. Inspection of EPC parameter change
  Deliverables:
    1. Difference of parameters before and after the upgrade
    2. Difference of parameters compared with the baseline
    3. Purpose and impact of the change of EPC parameters
  Closing Action:
    1. Determine whether the parameter change affects the call-drop KPI.
    2. Roll back the parameters and check whether the call-drop KPI is restored.
Checklist for the Entire-Network Problem (2/2)
Standard Action: Operation record check
  Analysis Action: Check whether batch operations affecting the global network are recorded and whether neighboring cells and PCI are re-planned.
  Deliverables: Records of batch operations affecting the global network
  Closing Action: Analyze the impact of batch operations on the call drop rate. Determine whether the batch operations can be rolled back.

Standard Action: Neighbor relationship check
  Analysis Action: Check for missed configuration of neighbor relationship. Deployment of scattered sites causes incorrect neighbor relationship.
  Deliverables: Missed configuration of neighbor relationship
  Closing Action: Add neighboring cells that are not configured in the neighbor relationship. Check whether the call drop KPI is restored.

Standard Action: Major event check
  Analysis Action: Check for allocation of a large quantity of phone numbers and major activity (such as ceremony, holidays, and games)
  Deliverables:
    1. Check the terminal type involved in the number allocation, quantity of number allocation, and subscription policy.
    2. Determine the range and time segment of the major event.
  Closing Action: Check whether the major event is coupled to the deterioration of the call drop rate in the time dimension.
Note
The standard actions of a comprehensive problem (entire-network plus top-cell problem) are a
combination of the checklist for the entire-network problem and the checklist for the top-cell
problem.
Checklist for the Top-cell Problem (1/2)
Standard Action: Preliminary analysis of traffic statistics of top sites
  Analysis Action:
    1. Analyze the traffic statistics to determine the range and cause of the call drop.
    2. Analyze the trend of the call drop rate to determine change of the call drop rate.
  Deliverables:
    1. Distribution of the causes and top causes
    2. Actions that affect the call drop rate
  Closing Action:
    1. Optimize the network according to the top causes of the call drop problem.
    2. Describe the actions that affect the call drop rate and the impact.

Standard Action: Version check of top sites
  Analysis Action: Check whether the eNodeB version is upgraded or a new patch is installed.
  Deliverables: New and old version numbers
  Closing Action: Describe the changes that may affect the call drop rate based on the Release Notes.

Standard Action: Equipment and transmission alarms of top sites
  Analysis Action: Alarm check of top sites
  Deliverables: Critical and major alarms
  Closing Action: Analyze the impact of alarms on the call drop rate. Clear the alarms and check whether the call drop KPI is restored.

Standard Action: Parameter configuration check of top sites
  Analysis Action: Parameter configuration check of top sites
  Deliverables:
    1. Difference of parameters before and after the upgrade
    2. Difference of parameters compared with the baseline
  Closing Action:
    1. Determine whether the parameter change affects the call-drop KPI.
    2. Roll back the parameters and check whether the call-drop KPI is restored.
Checklist for the Top-cell Problem (2/2)
Standard Action: Operation record check of top sites
  Analysis Action: Check whether batch operations affecting the global network are recorded and whether neighboring cells and PCI are re-planned.
  Deliverables: Records of batch operations affecting the global network
  Closing Action: Analyze the impact of batch operations on the call drop rate. Determine whether the batch operations can be rolled back.

Standard Action: Neighbor relationship check of top cells
  Analysis Action: Check for missed configuration of neighbor relationship. Scattered site deployment or network optimization leads to incorrect neighbor relationship.
  Deliverables: Missed configuration of neighbor relationship
  Closing Action: Add neighboring cells that are not configured in the neighbor relationship. Check whether the call drop KPI is restored.

Standard Action: Coverage check of top cells
  Analysis Action: Analyze the MCS and CQI contained in the traffic statistics, CHR, and drive test data to check for coverage overlap or weak coverage of the top cells.
  Deliverables: Coverage evaluation report of top cells
  Closing Action: Perform network optimization to optimize the coverage.

Standard Action: Interference check of top cells
  Analysis Action: Analyze the real-time trace data of the top cells to check for inter-modulation interference and external interference.
  Deliverables: Interference evaluation report of top cells
  Closing Action: Find out and remove the interference.

Standard Action: Major event check
  Analysis Action: Check for allocation of a large quantity of phone numbers and major activity (such as ceremony, holidays, and games) in the vicinity of top cells.
  Deliverables:
    1. Check the terminal type involved in the number allocation, quantity of number allocation, and subscription policy.
    2. Determine the range and time segment of the major event.
  Closing Action: Check whether the major event is coupled to the deterioration of the call drop rate in the time dimension.
Diagnosing Radio Problems
• Fault Description
  - If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.Radio, the abnormal release is caused by the Uu interface and occurs in a non-handover scenario.
• Possible Cause
  - The abnormal release is caused by weak coverage, uplink interference, or an abnormal UE, any of which can lead to the maximum number of RLC retransmissions being reached, loss of synchronization, or failure of signaling interactions. For details about diagnosing the interference problem, see LTE RF Channel Check and Troubleshooting Guide.
• Fault Handling Procedure
  - Analyze the CHR to check whether some top UEs have the highest count.
  - Analyze the cause values recorded in the CHR.
  - If the call drop is caused by a factor other than the signaling procedures, analyze the DRB scheduling at layer 2 to determine whether the call drop is caused by weak coverage or interference.
  - If the call drop is caused by signaling procedures, observe the last ten signaling messages to determine the faulty signaling procedure. Determine whether the fault of the signaling procedure is due to failure of either the UE or the eNodeB to receive or process the signaling messages.
  - The cause values recorded in the CHR are UEM_UECNT_REL_UE_RLC_UNRESTORE_IND, UEM_UECNT_REL_UE_RESYNC_TIMEROUT_REL_CAUSE, UEM_UECNT_REL_UE_RESYNC_DATA_IND_REL_CAUSE, UEM_UECNT_REL_UE_RLF_RECOVER_FAIL_REL_CAUSE, UEM_UECNT_REL_RRC_REEST_SRB1_FAIL, and UEM_UECNT_REL_RB_RECFG_FAIL_RRC_CONN_RECFG_CMP_FAIL.
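The first two steps of the procedure (finding top UEs and the distribution of cause values) are simple tallies once the CHR has been parsed. The sketch below assumes the CHR records have already been reduced to (TMSI, cause value) pairs; that record layout and the function names are hypothetical, and parsing the CHR itself is out of scope here.

```python
from collections import Counter


def top_ues(chr_records, top_n=5):
    """Return the TMSIs with the most abnormal releases as (tmsi, count) pairs."""
    return Counter(tmsi for tmsi, _ in chr_records).most_common(top_n)


def cause_distribution(chr_records):
    """Return the number of abnormal releases per CHR cause value."""
    return Counter(cause for _, cause in chr_records)
```

If one TMSI accounts for most of the releases in a cell, the problem is more likely tied to that UE than to the coverage or interference of the cell.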
Diagnosing Handover Failures
• Fault Description
  - If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.HOFailure, the abnormal release is caused by outgoing handover failure.
• Fault Handling Procedure
  - Obtain the top cells that have the highest counter L.E-RAB.AbnormRel.HOFailure, and analyze the pairs of source and target cells to obtain the top target cells that have the highest failure rate.
  - Analyze the CHR of the source and target cells to determine whether the handover failure is caused by failure to receive the handover command or random access failure. Examples of the cause values are UEM_UECNT_REL_HO_OUT_X2_REL_BACK_FAIL and UEM_UECNT_REL_HO_OUT_S1_REL_BACK_FAIL.
  - Optimize the handover parameters and neighbor relationship and check whether the call drop KPI is improved.
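Ranking source and target cell pairs by failure rate, as the first step of the procedure suggests, can be sketched as follows. The input layout (attempts and failures per pair) and the `min_attempts` filter are assumptions of this sketch; the filter is there so that pairs with very few attempts do not dominate the ranking.

```python
def top_handover_pairs(pairs, top_n=5, min_attempts=1):
    """Rank (source, target) cell pairs by outgoing handover failure rate.

    pairs maps (source_cell, target_cell) -> (attempts, failures).
    Returns a list of (source_cell, target_cell, failure_rate), worst first.
    """
    rates = []
    for (source, target), (attempts, failures) in pairs.items():
        if attempts < min_attempts:
            continue  # too few samples to judge this pair
        rates.append((source, target, failures / attempts))
    # Highest failure rate first; cell names break ties for a stable order.
    rates.sort(key=lambda r: (-r[2], r[0], r[1]))
    return rates[:top_n]
```

The target cells that appear repeatedly at the top of this list are the natural candidates for the CHR analysis in the next step.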
Diagnosing the Transmission Network Problem
• Fault Description
  - If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.TNL, the abnormal release is caused by the transmission network.
• Possible Cause
  - This call drop is caused by abnormal transmission between the eNodeB and MME, such as an S1 interface break.
• Fault Handling Procedure
  - Check for alarms about the transmission network. Clear the alarms and check whether the problem of abnormal release is solved.
  - Observe the M2000 and check whether alarms about the transmission network are recorded in the M2000.
  - Clear the alarms.
  - If abnormal releases are still recorded in the counter L.E-RAB.AbnormRel.TNL, collect the logs and submit them to R&D engineers for further analysis.
Diagnosing the Congestion Problem
• Fault Description
  - If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.Cong, the call drop is caused by resource congestion.
• Possible Cause
  - This call drop is caused by radio resource congestion, such as exceeding the maximum number of users.
• Fault Handling Procedure
  - If the long-term congestion of a top cell leads to call drops, a short-term solution is to enable the MLB algorithm or inter-operation to alleviate the load of the local cell. The long-term solution is to expand the capacity.
  - Enable the MLB algorithm and check whether the congestion problem is alleviated.
Diagnosing MME Faults
• Fault Description
  - If an abnormal release is recorded in the counter L.E-RAB.AbnormRel.MME, the abnormal release is initiated by the EPC. However, this abnormal release is not recorded in the counter L.E-RAB.AbnormRel.
• Fault Handling Procedure
  - Analyze the information of the EPC.
  - The cause value recorded in the CHR is UEM_UECNT_REL_MME_CMD. Analyze the last ten signaling messages recorded in the CHR. If these messages show that the problem is not caused by the eNodeB, focus on analysis of the EPC.
  - Analyze the S1 interface trace of the top cells to obtain the distribution of the cause value.
  - Discuss the analysis result and signaling messages with the EPC engineers.
Deliverables
• Output of the activities in the checklist
• If the front-line engineers fail to solve a difficult problem, collect the following information and submit it to R&D engineers for further analysis:
  - One-click log (Mandatory)
    Logs of the LMPT and LBBP of the top cells
  - Standard interface signaling (Mandatory)
    Signaling trace of the S1, X2, and Uu interfaces
  - Network configuration (Mandatory)
    Topology information, engineering parameters, and configuration files of the top sites
  - TTI trace (Optional)
    IFTS trace and cell trace. These traces generate a large amount of data. Only the data of the top cells and top time segments is collected.
  - Single-UE trace (Optional)
    The single-UE trace is used for in-depth diagnosis of top UEs. The entire-network single-UE trace can be performed by using the IMSI queried from the EPC using the TMSI.
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Case 1: RRC Reestablishment Failure of a UE
• As shown in the upper right figure, the cause value of the abnormal release is RRC_REEST_SRB1_FAIL.
• As shown in the middle right figure, this problem occurs repeatedly from 11:51 to 18:49 in cell 0.
• As shown in the lower right figure, the TMSI column shows that this problem is contributed by a single UE whose TMSI is C2 B0 B0 40 and the cause value is "Reconfiguration Failure".
• As shown in the lower left figure, the message type indicates that this reconfiguration message is not a handover command or measurement control. This message is probably for reconfiguration of the CQI, SRS, or transmission mode (TM). Upon reception of the RRC CONN REESTAB message, the UE does not respond. Therefore, the eNodeB releases the UE in 5 s.
Case 2: UE Exception
 Analysis of the CHR shows that the cause value of the abnormal release is RLC_UNRESTORE_IND. This cause value indicates that the maximum number of DRB RLC retransmissions is exceeded.
 This problem occurs repeatedly from 10:51 to 13:49 in cell 2.
 The TMSI column indicates that a single UE (TMSI C2 7F 20 56) contributes this problem.
 The DRB scheduling records of the last sixteen 64 ms periods show similar symptoms for every occurrence: data transmission terminates suddenly shortly after access. The duration from access to release ranges from tens of seconds to 2 minutes, indicating that the problem is not caused by a scripted test. The access type is MO-DATA, indicating that the user is performing a service.
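Checking whether a single UE dominates the abnormal releases can be sketched as follows. This is a minimal illustration, assuming the CHR records have been exported as (TMSI, cause value) pairs. The function name and the record layout are hypothetical and are not the CHR format itself.

```python
from collections import Counter


def top_ue_share(records, cause):
    """Return (tmsi, share) for the UE with the most releases of `cause`.

    `records` is an iterable of (tmsi, cause) pairs. Returns None when no
    record carries the given cause value.
    """
    per_ue = Counter(tmsi for tmsi, c in records if c == cause)
    if not per_ue:
        return None
    tmsi, n = per_ue.most_common(1)[0]
    return tmsi, n / sum(per_ue.values())


# Hypothetical export: nine releases from one UE and one from another.
records = [("C2 7F 20 56", "RLC_UNRESTORE_IND")] * 9
records.append(("00 00 00 01", "RLC_UNRESTORE_IND"))

print(top_ue_share(records, "RLC_UNRESTORE_IND"))
```

A share close to 1 suggests a UE-side problem rather than a cell-wide problem.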
Case 3: Poor Uplink Quality
• As shown in the right figure, the uplink RSRP and SINR received by the eNodeB are poor throughout the last four 512 ms periods and the last sixteen 64 ms periods: The uplink RSRP is below –135 dBm and the SINR of the SRS and DMRS is below –3 dB, indicating that the service drop is caused by weak uplink coverage.
• As shown in the left figure, over the same periods the uplink RSRP is about –130 dBm but the SINR of the uplink SRS and DMRS is below –3 dB, indicating that the service drop is due to weak uplink interference on top of weak coverage.
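The reasoning in this case can be sketched as a simple classification. The two thresholds (–135 dBm and –3 dB) are the values quoted in this case. The function name, the labels, and the strict use of the thresholds as decision boundaries are assumptions made for illustration only.

```python
RSRP_WEAK_DBM = -135.0  # uplink RSRP threshold quoted in this case
SINR_POOR_DB = -3.0     # SRS/DMRS SINR threshold quoted in this case


def classify_uplink(rsrp_dbm, sinr_db):
    """Classify one CHR sample by uplink RSRP (dBm) and SINR (dB)."""
    if sinr_db >= SINR_POOR_DB:
        return "uplink quality acceptable"
    if rsrp_dbm < RSRP_WEAK_DBM:
        # Poor SINR together with very low RSRP points to weak coverage.
        return "weak uplink coverage"
    # Poor SINR although the RSRP is above the threshold points to interference.
    return "uplink interference suspected"


print(classify_uplink(-136.0, -4.0))
print(classify_uplink(-130.0, -4.0))
```

In practice the classification should be applied to all of the last 512 ms and 64 ms samples, not to a single sample.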
Case 4: Target Cell Reconfiguration Failure
• Release cause
 TGT_ENB_RB_RECFG_FAIL is the cause value carried in the RB reconfiguration failure message during a handover.
 The symptom is that after the UE is successfully handed over to the target cell, the target eNodeB receives the PATH SWITCH REQ ACK message from the MME and, within 100 ms, sends the UE CONTEXT REL REQ message containing the cause value "unspecified". The lower left figure shows the last ten signaling messages.
• Fault diagnosis
 During the handover procedure, the EPC delivers the PATH_SWITCH_ACK message containing a downlink AMBR value that is inconsistent with the downlink AMBR contained in the S1/X2 handover request. Analysis shows that this is a defect of the RR module. The upper-layer control module of the RR module sends the AMBR Update message to the RB module, which determines that no reconfiguration message needs to be delivered to the UE and therefore returns a null value to the upper-layer control module. However, the upper-layer control module treats this return value as an exception and releases the UE. This problem is solved in eRAN2.1SPC430.
Case 5: Service Drop Caused by Inter-RAT Redirection
• Release cause: Inter-RAT redirection
 IRHO_REDIRECTION_TRIGER is the cause value for a release triggered by inter-RAT redirection. In eRAN2.1SPC400/SPH401, this cause value is counted as a call drop, as shown in the following figure.
 This problem is solved in eRAN2.1SPC420, as shown in the right figure.
Case 6: Service Drop Caused by Abnormal Transmission
• On December 11, the service drop rate of the entire network deteriorated on the Tele2 900M, Telenor 900M, and Tele2 2.6G bands, as shown in the following figure.
• Huawei field engineers discussed the problem with the customer and suspected the EPC, but received no positive answer.
Case 7: Service Drop Caused by Abnormal Uu Interface
• Release cause
 UE_RESYNC_TIMEROUT_REL_CAUSE indicates that the abnormal release is caused by expiry of the resynchronization timer. The standard interface trace records the same problem as "Radio Connection With UE Lost".
 UE_RLC_UNRESTORE_IND indicates that the abnormal release is caused by a restoration failure after the maximum number of RLC retransmissions is exceeded. The standard interface trace records the same problem as "Radio resources not available".
 UE_RESYNC_DATA_IND_REL_CAUSE indicates that the abnormal release is caused by resynchronization triggered by data reported by L2. The standard interface trace records the same problem as "Unspecified".
• Cause analysis
 The DRB scheduling information for the last four 512 ms periods and the last sixteen 64 ms periods shows that most abnormal releases are caused by suddenly terminated data transmission, possibly because the data card is unplugged or the UE is faulty. The following figure shows the CHR information.
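When the CHR is compared with the standard interface trace, the correspondence listed under "Release cause" can be kept as a lookup table. The following is a minimal sketch. The three pairs are taken from this case, while the dictionary and function names are hypothetical.

```python
# CHR cause value -> cause recorded by the standard interface trace,
# as listed in this case.
CHR_TO_S1_CAUSE = {
    "UE_RESYNC_TIMEROUT_REL_CAUSE": "Radio Connection With UE Lost",
    "UE_RLC_UNRESTORE_IND": "Radio resources not available",
    "UE_RESYNC_DATA_IND_REL_CAUSE": "Unspecified",
}


def chr_causes_for(s1_cause):
    """Return the CHR cause values that map to the given S1 cause."""
    return sorted(k for k, v in CHR_TO_S1_CAUSE.items() if v == s1_cause)


print(chr_causes_for("Unspecified"))
```

The reverse lookup helps narrow down which CHR cause values to search for when only the S1 trace is available.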
Case 8: RRC Connection Reestablishment Failure
• Release cause ("Radio Connection With UE Lost" recorded in the standard interface trace)
 RRC_REEST_SRB1_FAIL indicates a failure to restore SRB1 during RRC reestablishment.
 The last 10 signaling messages, as shown in the following figure, indicate that after sending the RRC_CONN_REESTAB message, the eNodeB fails to receive the RRC_CONN_REESTAB_CMP message from the UE before the 5 s timer on the Uu interface expires.
 The L2 scheduling information shows that the UE sends the ACK message upon reception of the RRC_CONN_REESTAB message.
 We suspect that the problem is caused by the failure of some UEs to send the RRC_CONN_REESTAB_CMP message. Some Samsung UEs have such a problem.
Thank you
www.huawei.com