PostgreSQL Source Code Reading (65) - Query Statement #50 (make_one_rel Function #15 - Join Paths #4)

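This section gives an overview of the join step in the dynamic programming algorithm (standard_join_search), focusing on the main logic of populate_joinrel_with_paths, which is reached through join_search_one_level->make_join_rel->populate_joinrel_with_paths. This function builds the access paths for a newly created joinrel.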
1. Data Structures
SpecialJoinInfo
 /*
  * "Special join" info.
  *
  * One-sided outer joins constrain the order of joining partially but not
  * completely.  We flatten such joins into the planner's top-level list of
  * relations to join, but record information about each outer join in a
  * SpecialJoinInfo struct.  These structs are kept in the PlannerInfo node's
  * join_info_list.
  *
  * Similarly, semijoins and antijoins created by flattening IN (subselect)
  * and EXISTS(subselect) clauses create partial constraints on join order.
  * These are likewise recorded in SpecialJoinInfo structs.
  *
  * We make SpecialJoinInfos for FULL JOINs even though there is no flexibility
  * of planning for them, because this simplifies make_join_rel()'s API.
  *
  * min_lefthand and min_righthand are the sets of base relids that must be
  * available on each side when performing the special join.  lhs_strict is
  * true if the special join's condition cannot succeed when the LHS variables
  * are all NULL (this means that an outer join can commute with upper-level
  * outer joins even if it appears in their RHS).  We don't bother to set
  * lhs_strict for FULL JOINs, however.
  *
  * It is not valid for either min_lefthand or min_righthand to be empty sets;
  * if they were, this would break the logic that enforces join order.
  *
  * syn_lefthand and syn_righthand are the sets of base relids that are
  * syntactically below this special join.  (These are needed to help compute
  * min_lefthand and min_righthand for higher joins.)
  *
  * delay_upper_joins is set true if we detect a pushed-down clause that has
  * to be evaluated after this join is formed (because it references the RHS).
  * Any outer joins that have such a clause and this join in their RHS cannot
  * commute with this join, because that would leave noplace to check the
  * pushed-down clause.  (We don't track this for FULL JOINs, either.)
  *
  * For a semijoin, we also extract the join operators and their RHS arguments
  * and set semi_operators, semi_rhs_exprs, semi_can_btree, and semi_can_hash.
  * This is done in support of possibly unique-ifying the RHS, so we don't
  * bother unless at least one of semi_can_btree and semi_can_hash can be set
  * true.  (You might expect that this information would be computed during
  * join planning; but it's helpful to have it available during planning of
  * parameterized table scans, so we store it in the SpecialJoinInfo structs.)
  *
  * jointype is never JOIN_RIGHT; a RIGHT JOIN is handled by switching
  * the inputs to make it a LEFT JOIN.  So the allowed values of jointype
  * in a join_info_list member are only LEFT, FULL, SEMI, or ANTI.
  *
  * For purposes of join selectivity estimation, we create transient
  * SpecialJoinInfo structures for regular inner joins; so it is possible
  * to have jointype == JOIN_INNER in such a structure, even though this is
  * not allowed within join_info_list.  We also create transient
  * SpecialJoinInfos with jointype == JOIN_INNER for outer joins, since for
  * cost estimation purposes it is sometimes useful to know the join size under
  * plain innerjoin semantics.  Note that lhs_strict, delay_upper_joins, and
  * of course the semi_xxx fields are not set meaningfully within such structs.
  */
 
 typedef struct SpecialJoinInfo
 {
     NodeTag     type;
     Relids      min_lefthand;   /* base relids in minimum LHS for join */
     Relids      min_righthand;  /* base relids in minimum RHS for join */
     Relids      syn_lefthand;   /* base relids syntactically within LHS */
     Relids      syn_righthand;  /* base relids syntactically within RHS */
     JoinType    jointype;       /* always INNER, LEFT, FULL, SEMI, or ANTI */
     bool        lhs_strict;     /* joinclause is strict for some LHS rel */
     bool        delay_upper_joins;  /* can't commute with upper RHS */
     /* Remaining fields are set only for JOIN_SEMI jointype: */
     bool        semi_can_btree; /* true if semi_operators are all btree */
     bool        semi_can_hash;  /* true if semi_operators are all hash */
     List       *semi_operators; /* OIDs of equality join operators */
     List       *semi_rhs_exprs; /* righthand-side expressions of these ops */
 } SpecialJoinInfo;
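
The header comment notes two invariants that are easy to miss: neither min_lefthand nor min_righthand may be empty, and jointype is never JOIN_RIGHT because a RIGHT JOIN is recorded as a LEFT JOIN with its inputs switched. The following is a minimal sketch of those two rules only. It assumes a toy model in which a relid set is a plain bitmask; `ToyRelids`, `ToyJoinType`, `ToySpecialJoin` and `toy_make_special_join` are hypothetical names, not PostgreSQL definitions, and the real planner computes min_lefthand/min_righthand with considerably more logic than shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical, not PostgreSQL's definitions). */
typedef uint32_t ToyRelids;     /* bit i set => base rel with RT index i */

typedef enum
{
    TOY_JOIN_LEFT,
    TOY_JOIN_RIGHT,
    TOY_JOIN_FULL,
    TOY_JOIN_SEMI,
    TOY_JOIN_ANTI
} ToyJoinType;

typedef struct ToySpecialJoin
{
    ToyRelids   min_lefthand;
    ToyRelids   min_righthand;
    ToyJoinType jointype;       /* never TOY_JOIN_RIGHT after normalizing */
} ToySpecialJoin;

/*
 * Build a toy special-join record from the syntactic left and right inputs.
 * Only the "no empty side" and "RIGHT becomes LEFT" rules are modeled.
 */
ToySpecialJoin
toy_make_special_join(ToyJoinType jointype, ToyRelids lhs, ToyRelids rhs)
{
    ToySpecialJoin sj;

    /* Neither side may be empty, or join-order enforcement would break. */
    assert(lhs != 0 && rhs != 0);

    if (jointype == TOY_JOIN_RIGHT)
    {
        /* A RIGHT JOIN is stored as a LEFT JOIN with the inputs switched. */
        sj.min_lefthand = rhs;
        sj.min_righthand = lhs;
        sj.jointype = TOY_JOIN_LEFT;
    }
    else
    {
        sj.min_lefthand = lhs;
        sj.min_righthand = rhs;
        sj.jointype = jointype;
    }
    return sj;
}
```

For instance, calling `toy_make_special_join(TOY_JOIN_RIGHT, 0x2, 0x4)` yields a record with jointype TOY_JOIN_LEFT whose minimum LHS is 0x4 and minimum RHS is 0x2.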


RelOptInfo
 typedef enum RelOptKind
 {
     RELOPT_BASEREL,             /* base relation (e.g. table or subquery) */
     RELOPT_JOINREL,             /* relation produced by a join */
     RELOPT_OTHER_MEMBER_REL,
     RELOPT_OTHER_JOINREL,
     RELOPT_UPPER_REL,           /* upper relation */
     RELOPT_OTHER_UPPER_REL,
     RELOPT_DEADREL
 } RelOptKind;
 
 /*
  * Is the given relation a simple relation i.e a base or "other" member
  * relation?
  */
 #define IS_SIMPLE_REL(rel) \
     ((rel)->reloptkind == RELOPT_BASEREL || \
      (rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
 
 /* Is the given relation a join relation? */
 #define IS_JOIN_REL(rel)    \
     ((rel)->reloptkind == RELOPT_JOINREL || \
      (rel)->reloptkind == RELOPT_OTHER_JOINREL)
 
 /* Is the given relation an upper relation? */
 #define IS_UPPER_REL(rel)   \
     ((rel)->reloptkind == RELOPT_UPPER_REL || \
      (rel)->reloptkind == RELOPT_OTHER_UPPER_REL)
 
 /* Is the given relation an "other" relation? */
 #define IS_OTHER_REL(rel) \
     ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL || \
      (rel)->reloptkind == RELOPT_OTHER_JOINREL || \
      (rel)->reloptkind == RELOPT_OTHER_UPPER_REL)
 
 typedef struct RelOptInfo
 {
     NodeTag     type;
 
     RelOptKind  reloptkind;
 
     /* all relations included in this RelOptInfo */
     Relids      relids;         /* set of base relids (rangetable indexes) */
 
     /* size estimates generated by planner */
     double      rows;           /* estimated number of result tuples */
 
     /* per-relation planner control flags */
     bool        consider_startup;   /* keep cheap-startup-cost paths? */
     bool        consider_param_startup; /* ditto, for parameterized paths? */
     bool        consider_parallel;  /* consider parallel paths? */
 
     /* default result targetlist for Paths scanning this relation */
     struct PathTarget *reltarget;   /* list of Vars/Exprs, cost, width */
 
     /* materialization information */
     List       *pathlist;       /* Path structures */
     List       *ppilist;        /* ParamPathInfos used in pathlist */
     List       *partial_pathlist;   /* partial Paths */
     struct Path *cheapest_startup_path;
     struct Path *cheapest_total_path;
     struct Path *cheapest_unique_path;
     List       *cheapest_parameterized_paths;
 
     /* parameterization information needed for both base rels and join rels */
     /* (see also lateral_vars and lateral_referencers) */
     Relids      direct_lateral_relids;  /* rels directly laterally referenced */
     Relids      lateral_relids; /* minimum parameterization of rel */
 
     /* information about a base rel (not set for join rels!) */
     Index       relid;          /* Relation ID */
     Oid         reltablespace;  /* containing tablespace */
     RTEKind     rtekind;        /* RELATION, SUBQUERY, FUNCTION, etc */
     AttrNumber  min_attr;       /* smallest attrno of rel (often <0) */
     AttrNumber  max_attr;       /* largest attrno of rel */
     Relids     *attr_needed;    /* array indexed [min_attr .. max_attr] */
     int32      *attr_widths;    /* array indexed [min_attr .. max_attr] */
     List       *lateral_vars;   /* LATERAL Vars and PHVs referenced by rel */
     Relids      lateral_referencers;    /* rels that reference me laterally */
     List       *indexlist;      /* list of IndexOptInfo */
     List       *statlist;       /* list of StatisticExtInfo */
     BlockNumber pages;          /* size estimates derived from pg_class */
     double      tuples;
     double      allvisfrac;
     PlannerInfo *subroot;       /* if subquery */
     List       *subplan_params; /* if subquery */
     int         rel_parallel_workers;   /* wanted number of parallel workers */
 
     /* Information about foreign tables and foreign joins */
     Oid         serverid;       /* identifies server for the table or join */
     Oid         userid;         /* identifies user to check access as */
     bool        useridiscurrent;    /* join is only valid for current user */
     /* use "struct FdwRoutine" to avoid including fdwapi.h here */
     struct FdwRoutine *fdwroutine;
     void       *fdw_private;
 
     /* cache space for remembering if we have proven this relation unique */
     List       *unique_for_rels;    /* known unique for these other relid
                                      * set(s) */
     List       *non_unique_for_rels;    /* known not unique for these set(s) */
 
     /* used by various scans and joins: */
     List       *baserestrictinfo;   /* RestrictInfo structures (if base rel) */
     QualCost    baserestrictcost;   /* cost of evaluating the above */
     Index       baserestrict_min_security;  /* min security_level found in
                                              * baserestrictinfo */
     List       *joininfo;       /* RestrictInfo structures for join clauses
                                  * involving this rel */
     bool        has_eclass_joins;   /* T means joininfo is incomplete */
 
     /* used by partitionwise joins: */
     bool        consider_partitionwise_join;    /* consider partitionwise
                                                  * join paths? (if
                                                  * partitioned rel) */
     Relids      top_parent_relids;  /* Relids of topmost parents (if "other"
                                      * rel) */
 
     /* used for partitioned relations */
     PartitionScheme part_scheme;    /* Partitioning scheme. */
     int         nparts;         /* number of partitions */
     struct PartitionBoundInfoData *boundinfo;   /* Partition bounds */
     List       *partition_qual; /* partition constraint */
     struct RelOptInfo **part_rels;  /* Array of RelOptInfos of partitions,
                                      * stored in the same order of bounds */
     List      **partexprs;      /* Non-nullable partition key expressions. */
     List      **nullable_partexprs; /* Nullable partition key expressions. */
     List       *partitioned_child_rels; /* List of RT indexes. */
 } RelOptInfo;
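
To make the classification macros concrete, here is a small sketch that reuses the RelOptKind values and three of the IS_*_REL macros from the listing on a cut-down struct. `ToyRelids`, `ToyRel` and `toy_relids_for_sjinfo_tests` are hypothetical names introduced only for illustration, and relid sets are modeled as a bitmask rather than a Bitmapset. The helper imitates the choice add_paths_to_joinrel makes between relids and top_parent_relids for child joins.

```c
#include <assert.h>
#include <stdint.h>

/* Same kinds and macros as in the listing; the Toy* types are hypothetical. */
typedef enum RelOptKind
{
    RELOPT_BASEREL,
    RELOPT_JOINREL,
    RELOPT_OTHER_MEMBER_REL,
    RELOPT_OTHER_JOINREL,
    RELOPT_UPPER_REL,
    RELOPT_OTHER_UPPER_REL,
    RELOPT_DEADREL
} RelOptKind;

#define IS_SIMPLE_REL(rel) \
    ((rel)->reloptkind == RELOPT_BASEREL || \
     (rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)

#define IS_JOIN_REL(rel) \
    ((rel)->reloptkind == RELOPT_JOINREL || \
     (rel)->reloptkind == RELOPT_OTHER_JOINREL)

#define IS_OTHER_REL(rel) \
    ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL || \
     (rel)->reloptkind == RELOPT_OTHER_JOINREL || \
     (rel)->reloptkind == RELOPT_OTHER_UPPER_REL)

typedef uint32_t ToyRelids;     /* bit i set => RT index i is a member */

typedef struct ToyRel
{
    RelOptKind  reloptkind;
    ToyRelids   relids;
    ToyRelids   top_parent_relids;  /* meaningful only for "other" rels */
} ToyRel;

/*
 * Pick the relid set to compare against SpecialJoinInfos: a child join
 * (RELOPT_OTHER_JOINREL) uses the relids of its topmost parents, because
 * SpecialJoinInfos exist only for the parent-level join.
 */
ToyRelids
toy_relids_for_sjinfo_tests(const ToyRel *rel)
{
    assert(IS_JOIN_REL(rel));
    if (rel->reloptkind == RELOPT_OTHER_JOINREL)
        return rel->top_parent_relids;
    return rel->relids;
}
```

Note that a RELOPT_OTHER_JOINREL satisfies both IS_JOIN_REL and IS_OTHER_REL, so the macros overlap by design rather than partitioning the kinds.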


2. Source Code Walkthrough
The call chain join_search_one_level -> ... (e.g. make_rels_by_clause_joins) -> make_join_rel -> populate_joinrel_with_paths ends in populate_joinrel_with_paths, which builds the access paths for the newly created joinrel (for the given pair of joining relations). The input parameter sjinfo (a SpecialJoinInfo struct) provides details about the join, and restrictlist (a List) holds the join clauses plus any other clauses applicable to the given pair of joining relations.

//-------------------------------------------------------------------- populate_joinrel_with_paths
/*
 * populate_joinrel_with_paths
 *    Add paths to the given joinrel for given pair of joining relations. The
 *    SpecialJoinInfo provides details about the join and the restrictlist
 *    contains the join clauses and the other clauses applicable for given pair
 *    of the joining relations.
 */
static void
populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
                            RelOptInfo *rel2, RelOptInfo *joinrel,
                            SpecialJoinInfo *sjinfo, List *restrictlist)
{
    /*
     * Consider paths using each rel as both outer and inner.  Depending on
     * the join type, a provably empty outer or inner rel might mean the join
     * is provably empty too; in which case throw away any previously computed
     * paths and mark the join as dummy.  (We do it this way since it's
     * conceivable that dummy-ness of a multi-element join might only be
     * noticeable for certain construction paths.)
     * 
     * Also, a provably constant-false join restriction typically means that
     * we can skip evaluating one or both sides of the join.  We do this by
     * marking the appropriate rel as dummy.  For outer joins, a
     * constant-false restriction that is pushed down still means the whole
     * join is dummy, while a non-pushed-down one means that no inner rows
     * will join so we can treat the inner rel as dummy.
     *
     * We need only consider the jointypes that appear in join_info_list, plus
     * JOIN_INNER.
     */
    switch (sjinfo->jointype)
    {
        case JOIN_INNER:
            if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
                restriction_is_constant_false(restrictlist, joinrel, false))
            {
                mark_dummy_rel(joinrel);    /* join is provably empty */
                break;
            }
            add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                 JOIN_INNER, sjinfo,
                                 restrictlist); /* rel1 as outer, rel2 as inner */
            add_paths_to_joinrel(root, joinrel, rel2, rel1,
                                 JOIN_INNER, sjinfo,
                                 restrictlist); /* rel2 as outer, rel1 as inner */
            break;
        case JOIN_LEFT:         /* left outer join */
            if (is_dummy_rel(rel1) ||
                restriction_is_constant_false(restrictlist, joinrel, true))
            {
                mark_dummy_rel(joinrel);
                break;
            }
            if (restriction_is_constant_false(restrictlist, joinrel, false) &&
                bms_is_subset(rel2->relids, sjinfo->syn_righthand))
                mark_dummy_rel(rel2);
            add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                 JOIN_LEFT, sjinfo,
                                 restrictlist);
            add_paths_to_joinrel(root, joinrel, rel2, rel1,
                                 JOIN_RIGHT, sjinfo,
                                 restrictlist);
            break;
        case JOIN_FULL:         /* full outer join */
            if ((is_dummy_rel(rel1) && is_dummy_rel(rel2)) ||
                restriction_is_constant_false(restrictlist, joinrel, true))
            {
                mark_dummy_rel(joinrel);
                break;
            }
            add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                 JOIN_FULL, sjinfo,
                                 restrictlist);
            add_paths_to_joinrel(root, joinrel, rel2, rel1,
                                 JOIN_FULL, sjinfo,
                                 restrictlist);

            /*
             * If there are join quals that aren't mergeable or hashable, we
             * may not be able to build any valid plan.  Complain here so that
             * we can give a somewhat-useful error message.  (Since we have no
             * flexibility of planning for a full join, there's no chance of
             * succeeding later with another pair of input rels.)
             */
            if (joinrel->pathlist == NIL)
                ereport(ERROR,
                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                         errmsg("FULL JOIN is only supported with merge-joinable or hash-joinable join conditions")));
            break;
        case JOIN_SEMI:         /* semijoin */

            /*
             * We might have a normal semijoin, or a case where we don't have
             * enough rels to do the semijoin but can unique-ify the RHS and
             * then do an innerjoin (see comments in join_is_legal).  In the
             * latter case we can't apply JOIN_SEMI joining.
             */
            if (bms_is_subset(sjinfo->min_lefthand, rel1->relids) &&
                bms_is_subset(sjinfo->min_righthand, rel2->relids))
            {
                if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
                    restriction_is_constant_false(restrictlist, joinrel, false))
                {
                    mark_dummy_rel(joinrel);
                    break;
                }
                add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                     JOIN_SEMI, sjinfo,
                                     restrictlist);
            }

            /*
             * If we know how to unique-ify the RHS and one input rel is
             * exactly the RHS (not a superset) we can consider unique-ifying
             * it and then doing a regular join.  (The create_unique_path
             * check here is probably redundant with what join_is_legal did,
             * but if so the check is cheap because it's cached.  So test
             * anyway to be sure.)
             */
            if (bms_equal(sjinfo->syn_righthand, rel2->relids) &&
                create_unique_path(root, rel2, rel2->cheapest_total_path,
                                   sjinfo) != NULL)
            {
                if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
                    restriction_is_constant_false(restrictlist, joinrel, false))
                {
                    mark_dummy_rel(joinrel);
                    break;
                }
                add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                     JOIN_UNIQUE_INNER, sjinfo,
                                     restrictlist);
                add_paths_to_joinrel(root, joinrel, rel2, rel1,
                                     JOIN_UNIQUE_OUTER, sjinfo,
                                     restrictlist);
            }
            break;
        case JOIN_ANTI:         /* antijoin */
            if (is_dummy_rel(rel1) ||
                restriction_is_constant_false(restrictlist, joinrel, true))
            {
                mark_dummy_rel(joinrel);
                break;
            }
            if (restriction_is_constant_false(restrictlist, joinrel, false) &&
                bms_is_subset(rel2->relids, sjinfo->syn_righthand))
                mark_dummy_rel(rel2);
            add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                 JOIN_ANTI, sjinfo,
                                 restrictlist);
            break;
        default:
            /* other values not expected here */
            elog(ERROR, "unrecognized join type: %d", (int) sjinfo->jointype);
            break;
    }

    /* Apply partitionwise join technique, if possible. */
    try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
}
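
The per-jointype rules for when the whole joinrel is marked dummy can be condensed into a truth-table style sketch. This is a simplified model under stated assumptions, not the planner's code: `ToyJoinType`, `ToyJoinInput`, `toy_constant_false` and `toy_joinrel_is_dummy` are hypothetical names, the two boolean "constant false" inputs stand in for what restriction_is_constant_false would find in restrictlist, and the semijoin case ignores the min_lefthand/min_righthand and unique-ification tests that guard it in the real function.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    TOY_INNER,
    TOY_LEFT,
    TOY_FULL,
    TOY_SEMI,
    TOY_ANTI
} ToyJoinType;

typedef struct ToyJoinInput
{
    bool        rel1_dummy;         /* is_dummy_rel(rel1) */
    bool        rel2_dummy;         /* is_dummy_rel(rel2) */
    bool        pushed_down_false;  /* a pushed-down clause is constant false */
    bool        join_clause_false;  /* a non-pushed-down clause is constant false */
} ToyJoinInput;

/* Models restriction_is_constant_false(restrictlist, joinrel, only_pushed_down). */
bool
toy_constant_false(const ToyJoinInput *in, bool only_pushed_down)
{
    if (only_pushed_down)
        return in->pushed_down_false;
    return in->pushed_down_false || in->join_clause_false;
}

/* Would the joinrel be marked dummy for this join type? */
bool
toy_joinrel_is_dummy(ToyJoinType jointype, const ToyJoinInput *in)
{
    switch (jointype)
    {
        case TOY_INNER:
        case TOY_SEMI:
            /* Either input empty, or any constant-false clause, empties the join. */
            return in->rel1_dummy || in->rel2_dummy ||
                toy_constant_false(in, false);
        case TOY_LEFT:
        case TOY_ANTI:
            /* Only an empty outer side or a pushed-down false clause does. */
            return in->rel1_dummy || toy_constant_false(in, true);
        case TOY_FULL:
            /* Both sides must be empty, or a pushed-down clause must be false. */
            return (in->rel1_dummy && in->rel2_dummy) ||
                toy_constant_false(in, true);
    }
    assert(0 && "unexpected join type");
    return false;
}
```

With an empty rel2 and no false clauses, the sketch reports a dummy join for TOY_INNER but not for TOY_LEFT or TOY_FULL, matching the idea that an outer join still emits its outer rows. A constant-false clause that belongs to the join itself likewise leaves a left join non-dummy; in the real function that case marks only rel2 as dummy.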


//------------------------------------------------------------------- add_paths_to_joinrel
/*
 * add_paths_to_joinrel
 *    Given a join relation and two component rels from which it can be made,
 *    consider all possible paths that use the two component rels as outer
 *    and inner rel respectively.  Add these paths to the join rel's pathlist
 *    if they survive comparison with other paths (and remove any existing
 *    paths that are dominated by these paths).
 * 
 * Modifies the pathlist field of the joinrel node to contain the best
 * paths found so far.
 *
 * jointype is not necessarily the same as sjinfo->jointype; it might be
 * "flipped around" if we are considering joining the rels in the opposite
 * direction from what's indicated in sjinfo.
 * 
 * Also, this routine and others in this module accept the special JoinTypes
 * JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER to indicate that we should
 * unique-ify the outer or inner relation and then apply a regular inner
 * join.  These values are not allowed to propagate outside this module,
 * however.  Path cost estimation code may need to recognize that it's
 * dealing with such a case --- the combination of nominal jointype INNER
 * with sjinfo->jointype == JOIN_SEMI indicates that.
 */
void
add_paths_to_joinrel(PlannerInfo *root,
                     RelOptInfo *joinrel,
                     RelOptInfo *outerrel,
                     RelOptInfo *innerrel,
                     JoinType jointype,
                     SpecialJoinInfo *sjinfo,
                     List *restrictlist)
{
    JoinPathExtraData extra;
    bool        mergejoin_allowed = true;
    ListCell   *lc;
    Relids      joinrelids;

    /*
     * PlannerInfo doesn't contain the SpecialJoinInfos created for joins
     * between child relations, even if there is a SpecialJoinInfo node for
     * the join between the topmost parents. So, while calculating Relids set
     * representing the restriction, consider relids of topmost parent of
     * partitions.
     */
    if (joinrel->reloptkind == RELOPT_OTHER_JOINREL)
        joinrelids = joinrel->top_parent_relids;
    else
        joinrelids = joinrel->relids;

    extra.restrictlist = restrictlist;
    extra.mergeclause_list = NIL;
    extra.sjinfo = sjinfo;
    extra.param_source_rels = NULL;

    /*
     * See if the inner relation is provably unique for this outer rel.
     * 
     * We have some special cases: for JOIN_SEMI and JOIN_ANTI, it doesn't
     * matter since the executor can make the equivalent optimization anyway;
     * we need not expend planner cycles on proofs.  For JOIN_UNIQUE_INNER, we
     * must be considering a semijoin whose inner side is not provably unique
     * (else reduce_unique_semijoins would've simplified it), so there's no
     * point in calling innerrel_is_unique.  However, if the LHS covers all of
     * the semijoin's min_lefthand, then it's appropriate to set inner_unique
     * because the path produced by create_unique_path will be unique relative
     * to the LHS.  (If we have an LHS that's only part of the min_lefthand,
     * that is *not* true.)  For JOIN_UNIQUE_OUTER, pass JOIN_INNER to avoid
     * letting that value escape this module.
     */
    switch (jointype)
    {
        case JOIN_SEMI:
        case JOIN_ANTI:
            extra.inner_unique = false; /* well, unproven */
            break;
        case JOIN_UNIQUE_INNER:
            extra.inner_unique = bms_is_subset(sjinfo->min_lefthand,
                                               outerrel->relids);
            break;
        case JOIN_UNIQUE_OUTER:
            extra.inner_unique = innerrel_is_unique(root,
                                                    joinrel->relids,
                                                    outerrel->relids,
                                                    innerrel,
                                                    JOIN_INNER,
                                                    restrictlist,
                                                    false);
            break;
        default:
            extra.inner_unique = innerrel_is_unique(root,
                                                    joinrel->relids,
                                                    outerrel->relids,
                                                    innerrel,
                                                    jointype,
                                                    restrictlist,
                                                    false);
            break;
    }

    /*
     * Find potential mergejoin clauses.  We can skip this if we are not
     * interested in doing a mergejoin.  However, mergejoin may be our only
     * way of implementing a full outer join, so override enable_mergejoin if
     * it's a full join.
     */
    if (enable_mergejoin || jointype == JOIN_FULL)
        extra.mergeclause_list = select_mergejoin_clauses(root,
                                                          joinrel,
                                                          outerrel,
                                                          innerrel,
                                                          restrictlist,
                                                          jointype,
                                                          &mergejoin_allowed);

    /*
     * If it's SEMI, ANTI, or inner_unique join, compute correction factors
     * for cost estimation.  These will be the same for all paths.
     */
    if (jointype == JOIN_SEMI || jointype == JOIN_ANTI || extra.inner_unique)
        compute_semi_anti_join_factors(root, joinrel, outerrel, innerrel,
                                       jointype, sjinfo, restrictlist,
                                       &extra.semifactors);

    /*
     * Decide whether it's sensible to generate parameterized paths for this
     * joinrel, and if so, which relations such paths should require.  There
     * is usually no need to create a parameterized result path unless there
     * is a join order restriction that prevents joining one of our input rels
     * directly to the parameter source rel instead of joining to the other
     * input rel.  (But see allow_star_schema_join().)  This restriction
     * reduces the number of parameterized paths we have to deal with at
     * higher join levels, without compromising the quality of the resulting
     * plan.  We express the restriction as a Relids set that must overlap the
     * parameterization of any proposed join path.
     */
    foreach(lc, root->join_info_list)
    {
        SpecialJoinInfo *sjinfo2 = (SpecialJoinInfo *) lfirst(lc);

        /*
         * SJ is relevant to this join if we have some part of its RHS
         * (possibly not all of it), and haven't yet joined to its LHS.  (This
         * test is pretty simplistic, but should be sufficient considering the
         * join has already been proven legal.)  If the SJ is relevant, it
         * presents constraints for joining to anything not in its RHS.
         */
        if (bms_overlap(joinrelids, sjinfo2->min_righthand) &&
            !bms_overlap(joinrelids, sjinfo2->min_lefthand))
            extra.param_source_rels = bms_join(extra.param_source_rels,
                                               bms_difference(root->all_baserels,
                                                              sjinfo2->min_righthand));

        /* full joins constrain both sides symmetrically */
        if (sjinfo2->jointype == JOIN_FULL &&
            bms_overlap(joinrelids, sjinfo2->min_lefthand) &&
            !bms_overlap(joinrelids, sjinfo2->min_righthand))
            extra.param_source_rels = bms_join(extra.param_source_rels,
                                               bms_difference(root->all_baserels,
                                                              sjinfo2->min_lefthand));
    }

    /*
     * However, when a LATERAL subquery is involved, there will simply not be
     * any paths for the joinrel that aren't parameterized by whatever the
     * subquery is parameterized by, unless its parameterization is resolved
     * within the joinrel.  So we might as well allow additional dependencies
     * on whatever residual lateral dependencies the joinrel will have.
     */
    extra.param_source_rels = bms_add_members(extra.param_source_rels,
                                              joinrel->lateral_relids);

    /*
     * 1. Consider mergejoin paths where both relations must be explicitly
     * sorted.  Skip this if we can't mergejoin.
     */
    if (mergejoin_allowed)
        sort_inner_and_outer(root, joinrel, outerrel, innerrel,
                             jointype, &extra);

    /*
     * 2. Consider paths where the outer relation need not be explicitly
     * sorted. This includes both nestloops and mergejoins where the outer
     * path is already ordered.  Again, skip this if we can't mergejoin.
     * (That's okay because we know that nestloop can't handle right/full
     * joins at all, so it wouldn't work in the prohibited cases either.)
     */
    if (mergejoin_allowed)
        match_unsorted_outer(root, joinrel, outerrel, innerrel,
                             jointype, &extra);

#ifdef NOT_USED

    /*
     * 3. Consider paths where the inner relation need not be explicitly
     * sorted.  This includes mergejoins only (nestloops were already built in
     * match_unsorted_outer).
     *
     * Diked out as redundant 2/13/2000 -- tgl.  There isn't any really
     * significant difference between the inner and outer side of a mergejoin,
     * so match_unsorted_inner creates no paths that aren't equivalent to
     * those made by match_unsorted_outer when add_paths_to_joinrel() is
     * invoked with the two rels given in the other order.
     */
    if (mergejoin_allowed)
        match_unsorted_inner(root, joinrel, outerrel, innerrel,
                             jointype, &extra);
#endif

    /*
     * 4. Consider paths where both outer and inner relations must be hashed
     * before being joined.  As above, disregard enable_hashjoin for full
     * joins, because there may be no other alternative.
     */
    if (enable_hashjoin || jointype == JOIN_FULL)
        hash_inner_and_outer(root, joinrel, outerrel, innerrel,
                             jointype, &extra);

    /*
     * 5. If inner and outer relations are foreign tables (or joins) belonging
     * to the same server and assigned to the same user to check access
     * permissions as, give the FDW a chance to push down joins.
     */
    if (joinrel->fdwroutine &&
        joinrel->fdwroutine->GetForeignJoinPaths)
        joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
                                                 outerrel, innerrel,
                                                 jointype, &extra);

    /*
     * 6. Finally, give extensions a chance to manipulate the path list.
     */
    if (set_join_pathlist_hook)
        set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
                               jointype, &extra);
}


3. Trace analysis
The SQL statement is as follows.
testdb=# explain verbose select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je 
from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je 
                        from t_grxx gr inner join t_jfxx jf 
                                       on gr.dwbh = dw.dwbh 
                                          and gr.grbh = jf.grbh) grjf 
order by dw.dwbh;
                                              QUERY PLAN                                               
-------------------------------------------------------------------------------------------------------
 Merge Join  (cost=18841.64..21009.94 rows=99850 width=47)
   Output: dw.dwmc, dw.dwbh, dw.dwdz, gr.grbh, gr.xm, jf.ny, jf.je
   Merge Cond: ((dw.dwbh)::text = (gr.dwbh)::text)
   ->  Index Scan using t_dwxx_pkey on public.t_dwxx dw  (cost=0.29..399.62 rows=10000 width=20)
         Output: dw.dwmc, dw.dwbh, dw.dwdz
   ->  Materialize  (cost=18836.82..19336.82 rows=100000 width=31)
         Output: gr.grbh, gr.xm, gr.dwbh, jf.ny, jf.je
         ->  Sort  (cost=18836.82..19086.82 rows=100000 width=31)
               Output: gr.grbh, gr.xm, gr.dwbh, jf.ny, jf.je
               Sort Key: gr.dwbh
               ->  Hash Join  (cost=3465.00..8138.00 rows=100000 width=31)
                     Output: gr.grbh, gr.xm, gr.dwbh, jf.ny, jf.je
                     Hash Cond: ((jf.grbh)::text = (gr.grbh)::text)
                     ->  Seq Scan on public.t_jfxx jf  (cost=0.00..1637.00 rows=100000 width=20)
                           Output: jf.ny, jf.je, jf.grbh
                     ->  Hash  (cost=1726.00..1726.00 rows=100000 width=16)
                           Output: gr.grbh, gr.xm, gr.dwbh
                           ->  Seq Scan on public.t_grxx gr  (cost=0.00..1726.00 rows=100000 width=16)
                                 Output: gr.grbh, gr.xm, gr.dwbh
(19 rows)

Three base tables take part in the join: t_dwxx, t_grxx and t_jfxx. As the execution plan shows, because of the ORDER BY dw.dwbh clause the optimizer "cleverly" chooses a Merge Join.
Start gdb, set a breakpoint, and examine only the level = 3 case (the final result).
(gdb) b join_search_one_level
Breakpoint 1 at 0x755667: file joinrels.c, line 67.
(gdb) c
Continuing.

Breakpoint 1, join_search_one_level (root=0x1cae678, level=2) at joinrels.c:67
67    List    **joinrels = root->join_rel_level;
(gdb) c
Continuing.

Breakpoint 1, join_search_one_level (root=0x1cae678, level=3) at joinrels.c:67
67    List    **joinrels = root->join_rel_level;
(gdb) 

Trace populate_joinrel_with_paths.
(gdb) b populate_joinrel_with_paths
Breakpoint 2 at 0x75646d: file joinrels.c, line 780.

Enter the populate_joinrel_with_paths function.
(gdb) c
Continuing.

Breakpoint 2, populate_joinrel_with_paths (root=0x1cae678, rel1=0x1d10978, rel2=0x1d09610, joinrel=0x1d131b8, 
    sjinfo=0x7ffef59baf20, restrictlist=0x1d135e8) at joinrels.c:780
780   switch (sjinfo->jointype)

Inspect the input parameters:
1. root: see simple_rte_array; simple_rel_array_size = 6, so there are 6 slots, of which 1 -> 16734/t_dwxx, 3 -> 16742/t_grxx, 4 -> 16747/t_jfxx
2. rel1: the relation produced by joining RTEs 1 and 3, i.e. the join of t_dwxx and t_grxx
3. rel2: RTE 4, i.e. t_jfxx
4. joinrel: the join relation that build_join_rel created from rel1 and rel2
5. sjinfo: join information; the join type is inner join (JOIN_INNER)
6. restrictlist: the list of restriction clauses, here t_grxx.grbh = t_jfxx.grbh
(gdb) p *root
$3 = {type = T_PlannerInfo, parse = 0x1cd7830, glob = 0x1cb8d38, query_level = 1, parent_root = 0x0, plan_params = 0x0, 
  outer_params = 0x0, simple_rel_array = 0x1d07af8, simple_rel_array_size = 6, simple_rte_array = 0x1d07b48, 
  all_baserels = 0x1d0ada8, nullable_baserels = 0x0, join_rel_list = 0x1d10e48, join_rel_hash = 0x0, 
  join_rel_level = 0x1d10930, join_cur_level = 3, init_plans = 0x0, cte_plan_ids = 0x0, multiexpr_params = 0x0, 
  eq_classes = 0x1d0a6d8, canon_pathkeys = 0x1d0ad28, left_join_clauses = 0x0, right_join_clauses = 0x0, 
  full_join_clauses = 0x0, join_info_list = 0x0, append_rel_list = 0x0, rowMarks = 0x0, placeholder_list = 0x0, 
  fkey_list = 0x0, query_pathkeys = 0x1d0ad78, group_pathkeys = 0x0, window_pathkeys = 0x0, distinct_pathkeys = 0x0, 
  sort_pathkeys = 0x1d0ad78, part_schemes = 0x0, initial_rels = 0x1d108c0, upper_rels = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
    0x0}, upper_targets = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, processed_tlist = 0x1cbb608, grouping_map = 0x0, 
  minmax_aggs = 0x0, planner_cxt = 0x1bfa040, total_table_pages = 1427, tuple_fraction = 0, limit_tuples = -1, 
  qual_security_level = 0, inhTargetKind = INHKIND_NONE, hasJoinRTEs = true, hasLateralRTEs = false, 
  hasDeletedRTEs = false, hasHavingQual = false, hasPseudoConstantQuals = false, hasRecursion = false, wt_param_id = -1, 
  non_recursive_path = 0x0, curOuterRels = 0x0, curOuterParams = 0x0, join_search_private = 0x0, partColsUpdated = false}
(gdb) p *root->simple_rte_array[1]
$4 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16734, relkind = 114 'r', tablesample = 0x0, subquery = 0x0, ...
...
(gdb) p *rel1->relids
$10 = {nwords = 1, words = 0x1d10b8c}
(gdb) p *rel1->relids->words
$11 = 10
(gdb) p *rel2->relids->words
$13 = 16
(gdb) p *joinrel->relids->words
$15 = 26
(gdb) p *sjinfo
$16 = {type = T_SpecialJoinInfo, min_lefthand = 0x1d10b88, min_righthand = 0x1d09518, syn_lefthand = 0x1d10b88, 
  syn_righthand = 0x1d09518, jointype = JOIN_INNER, lhs_strict = false, delay_upper_joins = false, semi_can_btree = false, 
  semi_can_hash = false, semi_operators = 0x0, semi_rhs_exprs = 0x0}
...
(gdb) p *(Var *)((RelabelType *)$args->head->data.ptr_value)->arg
$34 = {xpr = {type = T_Var}, varno = 3, varattno = 2, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, 
  varnoold = 3, varoattno = 2, location = 273}  -->t_grxx.grbh
(gdb) p *(Var *)((RelabelType *)$args->head->next->data.ptr_value)->arg
$35 = {xpr = {type = T_Var}, varno = 4, varattno = 1, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, 
  varnoold = 4, varoattno = 1, location = 283}  -->t_jfxx.grbh

Enter the JOIN_INNER branch, which calls add_paths_to_joinrel.
(gdb) 
789       add_paths_to_joinrel(root, joinrel, rel1, rel2,

Enter the add_paths_to_joinrel function.
(gdb) step
add_paths_to_joinrel (root=0x1cae678, joinrel=0x1d131b8, outerrel=0x1d10978, innerrel=0x1d09610, jointype=JOIN_INNER, 
    sjinfo=0x7ffef59baf20, restrictlist=0x1d135e8) at joinpath.c:126
126   bool    mergejoin_allowed = true;

Determine whether the inner relation is provably unique.
162   switch (jointype)
(gdb) 
182       extra.inner_unique = innerrel_is_unique(root,
(gdb) 
189       break;
(gdb) p extra.inner_unique
$36 = false

Find the potential mergejoin clauses; this step is skipped when merge join is not allowed. The clause found is t_grxx.grbh = t_jfxx.grbh.
(gdb) n
198   if (enable_mergejoin || jointype == JOIN_FULL)
(gdb) 
199     extra.mergeclause_list = select_mergejoin_clauses(root,
(gdb) 
211   if (jointype == JOIN_SEMI || jointype == JOIN_ANTI || extra.inner_unique)
(gdb) p *(Var *)((RelabelType *)$args->head->data.ptr_value)->arg
$47 = {xpr = {type = T_Var}, varno = 3, varattno = 2, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, 
  varnoold = 3, varoattno = 2, location = 273} -->t_grxx.grbh
(gdb) p *(Var *)((RelabelType *)$args->head->next->data.ptr_value)->arg
$48 = {xpr = {type = T_Var}, varno = 4, varattno = 1, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, 
  varnoold = 4, varoattno = 1, location = 283} -->t_jfxx.grbh

Check whether it is sensible to build parameterized paths for this join and, if so, which relations those paths would need (the result here is NULL).
(gdb) 
261   extra.param_source_rels = bms_add_members(extra.param_source_rels,
(gdb) 
268   if (mergejoin_allowed)
(gdb) p *extra.param_source_rels
Cannot access memory at address 0x0

Try merge join paths where both relations must be explicitly sorted. Note: joinrel->pathlist is NULL before the call and holds a generated path afterwards.
(gdb) p *joinrel->pathlist
Cannot access memory at address 0x0
(gdb) n
269     sort_inner_and_outer(root, joinrel, outerrel, innerrel,
(gdb) 
279   if (mergejoin_allowed)
(gdb) p *joinrel->pathlist
$50 = {type = T_List, length = 1, head = 0x1d13850, tail = 0x1d13850}

The rest of the logic is similar; the implementation of sort_inner_and_outer and the related functions will be explained in detail in later sections. In the end, two paths are generated and stored in the pathlist list.
324   if (set_join_pathlist_hook)
(gdb) 
327 }
(gdb) p *joinrel->pathlist
$51 = {type = T_List, length = 2, head = 0x1d13850, tail = 0x1d13930}
(gdb) p *(Node *)joinrel->pathlist->head->data.ptr_value
$52 = {type = T_HashPath}
(gdb) p *(HashPath *)joinrel->pathlist->head->data.ptr_value
$53 = {jpath = {path = {type = T_HashPath, pathtype = T_HashJoin, parent = 0x1d131b8, pathtarget = 0x1d133c8, 
      param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 99850, 
      startup_cost = 3762, total_cost = 10075.348750000001, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = false, 
    outerjoinpath = 0x1d11f48, innerjoinpath = 0x1d0f548, joinrestrictinfo = 0x1d135e8}, path_hashclauses = 0x1d13aa0, 
  num_batches = 2, inner_rows_total = 100000}
(gdb) p *(Node *)joinrel->pathlist->head->next->data.ptr_value
$54 = {type = T_NestPath}
(gdb) p *(NestPath *)joinrel->pathlist->head->next->data.ptr_value
$55 = {path = {type = T_NestPath, pathtype = T_NestLoop, parent = 0x1d131b8, pathtarget = 0x1d133c8, param_info = 0x0, 
    parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 99850, startup_cost = 39.801122856046675, 
    total_cost = 41318.966172885761, pathkeys = 0x1d0b818}, jointype = JOIN_INNER, inner_unique = false, 
  outerjoinpath = 0x1d119d8, innerjoinpath = 0x1d0f9d8, joinrestrictinfo = 0x0}

DONE!
References
allpaths.c cost.h costsize.c PG Document:Query Planning
